Last updated on September 26, 2023. This conference program is tentative and subject to change.
Technical Program for Wednesday, October 4, 2023

WeAT1 Regular session, 140A
HRI - Learning
Chair: Leite, Iolanda | KTH Royal Institute of Technology
Co-Chair: Gombolay, Matthew | Georgia Institute of Technology

08:30-08:36, Paper WeAT1.1
Primitive Skill-Based Robot Learning from Human Evaluative Feedback

Hiranaka, Ayano | Stanford University
Hwang, Minjune | Stanford University
Lee, Sharon | Stanford University
Wang, Chen | Stanford University
Fei-Fei, Li | Stanford University
Wu, Jiajun | Stanford University
Zhang, Ruohan | Stanford University
Keywords: Human Factors and Human-in-the-Loop, Reinforcement Learning, Machine Learning for Robot Control
Abstract: Reinforcement learning (RL) algorithms face significant challenges when dealing with long-horizon robot manipulation tasks in real-world environments due to sample inefficiency and safety issues. To overcome these challenges, we propose a novel framework, SEED, which leverages two approaches: reinforcement learning from human feedback (RLHF) and primitive skill-based reinforcement learning. Both approaches are particularly effective in addressing sparse reward issues and the complexities involved in long-horizon tasks. By combining them, SEED reduces the human effort required in RLHF and increases safety in training robot manipulation with RL in real-world settings. Additionally, parameterized skills provide a clear view of the agent's high-level intentions, allowing humans to evaluate skill choices before they are executed. This feature makes the training process even safer and more efficient. To evaluate the performance of SEED, we conducted extensive experiments on five manipulation tasks with varying levels of complexity. Our results show that SEED significantly outperforms state-of-the-art RL algorithms in sample efficiency and safety. In addition, SEED also exhibits a substantial reduction of human effort compared to other RLHF methods. Further details and video results can be found at https://seediros23.github.io/.

08:36-08:42, Paper WeAT1.2
Autocomplete of 3D Motions for UAV Teleoperation

Ibrahim, Batool | American University of Beirut (AUB)
Haj Hussein, Mohammad | American University of Beirut
Elhajj, Imad | American University of Beirut
Asmar, Daniel | American University of Beirut
Keywords: Human-Robot Collaboration, Deep Learning Methods
Abstract: Tele-operating aerial vehicles without any automated assistance is challenging due to various limitations, especially for inexperienced users. Autocomplete addresses this problem by automatically identifying and completing the user's intended motion. Such a framework uses machine learning to recognize and classify human inputs as one of a set of motion primitives, and then, if the human operator accepts, synthesizes the motion in order to complete the desired motion. This has been shown to improve the performance of the system and reduce operator workload. Previous Autocomplete systems focused on different 2D motions (line, arc, sine, etc.). However, since most UAV tasks take place in a 3D world, this paper introduces Autocomplete for 3D motions. Moreover, the proposed framework provides just-in-time prediction of the 3D motions by proposing a change-point detection technique, which allows the framework to autonomously identify when to conduct a prediction. It also deals with variable motion sizes. Real-time simulation results show that the proposed framework is capable of predicting the user's intentions after change-point detection.

08:42-08:48, Paper WeAT1.3
Exploiting Spatio-Temporal Human-Object Relations Using Graph Neural Networks for Human Action Recognition and 3D Motion Forecasting

Lagamtzis, Dimitrios | Esslingen University of Applied Sciences
Schmidt, Fabian | Esslingen University of Applied Sciences
Seyler, Jan Reinke | Festo SE & Co. KG
Dang, Thao | Daimler AG
Schober, Steffen | Esslingen University
Keywords: Human-Robot Collaboration, Intention Recognition, Industrial Robots
Abstract: Human action recognition and motion forecasting are becoming increasingly successful, in particular when utilizing graphs. We aim to transfer this success to the context of industrial Human-Robot Collaboration (HRC), where humans work closely with robots and interact with workpieces in defined workspaces. For this purpose, it is necessary to use all the information extractable in such a workspace and represent it with a natural structure, such as graphs, that can be used for learning. Since humans are the center of HRC, it is mandatory to construct the graph in a human-centered way and use real-world 3D information as well as object labels to represent their environment. Therefore, we present a novel Graph Neural Network (GNN) architecture which combines human action recognition and motion forecasting for industrial HRC environments. We evaluate our method on two different, publicly available human action datasets, including one that is a particularly realistic representation of industrial HRC, and compare the results with baseline methods for classifying the current human action and predicting the human motion. Our experiments show that our combined GNN approach improves the accuracy of action recognition compared to previous work, significantly so on the CoAx dataset, by up to 20%. Further, our motion forecasting approach performs better than existing baselines, predicting human trajectories with a Final Displacement Error (FDE) of less than 10 cm for a prediction horizon of 1 s.

08:48-08:54, Paper WeAT1.4
Improving Human-Robot Interaction Effectiveness in Human-Robot Collaborative Object Transportation Using Force Prediction

Dominguez-Vidal, Jose Enrique | Institut de Robòtica i Informàtica Industrial, CSIC-UPC
Sanfeliu, Alberto | Universitat Politècnica de Catalunya
Keywords: Human-Robot Collaboration, Physical Human-Robot Interaction, Deep Learning Methods
Abstract: In this work, we analyse the use of a prediction of the human's force in a Human-Robot collaborative object transportation task at a middle distance. We check that this force prediction can improve multiple parameters associated with effective Human-Robot Interaction (HRI), such as perception of the robot's contribution to the task, comfort, or trust in the robot in a physical Human-Robot Interaction (pHRI). We present a Deep Learning model that predicts the force that a human will exert in the next 1 s, using as inputs the force previously exerted by the human, the robot's velocity, and environment information obtained from the robot's LiDAR. Its success rate is up to 92.3% on the test set and up to 89.1% in real experiments. We demonstrate that this force prediction, in addition to being directly usable to detect changes in the human's intention, can be processed to obtain an estimate of the human's desired trajectory. We have validated this approach with a user study involving 18 volunteers.

08:54-09:00, Paper WeAT1.5
Leveraging Saliency-Aware Gaze Heatmaps for Multiperspective Teaching of Unknown Objects

Weber, Daniel | University of Tübingen
Bolz, Valentin | University of Tübingen
Zell, Andreas | University of Tübingen
Kasneci, Enkelejda | University of Tübingen
Keywords: Human Factors and Human-in-the-Loop, Human-Robot Collaboration, Human-Centered Robotics
Abstract: As robots become increasingly prevalent amidst diverse environments, their ability to adapt to novel scenarios and objects is essential. Advances in modern object detection have also paved the way for robots to identify interaction entities within their immediate vicinity. One drawback is that the robot's operational domain must be known at the time of training, which hinders the robot's ability to adapt to unexpected environments outside the preselected classes. However, when encountering such challenges a human can provide support to a robot by teaching it about the new, yet unknown objects on an ad hoc basis. In this work, we merge augmented reality and human gaze in the context of multimodal human-robot interaction to compose saliency-aware gaze heatmaps leveraged by a robot to learn emerging objects of interest. Our results show that our proposed method exceeds the capabilities of the current state of the art and outperforms it in terms of commonly used object detection metrics.

09:00-09:06, Paper WeAT1.6
Language Guided Temporally Adaptive Perception for Efficient Natural Language Grounding in Cluttered Dynamic Worlds

Patki, Siddharth | University of Rochester
Arkin, Jacob | Massachusetts Institute of Technology
Raicevic, Nikola | University of Rochester
Howard, Thomas | University of Rochester
Keywords: Multi-Modal Perception for HRI, Human-Robot Collaboration, Natural Dialog for HRI
Abstract: As robots operate alongside humans in shared spaces, such as homes and offices, it is essential to have an effective mechanism for interacting with them. Natural language offers an intuitive interface for communicating with robots, but most of the recent approaches to grounded language understanding reason only in the context of an instantaneous state of the world. Though this allows for interpreting a variety of utterances in the current context of the world, these models fail to interpret utterances which require the knowledge of past dynamics of the world, thereby hindering effective human-robot collaboration in dynamic environments. Constructing a comprehensive model of the world that tracks the dynamics of all objects in the robot's workspace is computationally expensive and difficult to scale with increasingly complex environments. To address this challenge, we propose a learned model of language and perception that facilitates the construction of temporally compact models of dynamic worlds through closed-loop grounding and perception. Our experimental results on the task of grounding referring expressions demonstrate more accurate interpretation of robot instructions in cluttered and dynamic table-top environments without a significant increase in runtime as compared to an open-loop baseline.

09:06-09:12, Paper WeAT1.7
T-Top, an Open Source Tabletop Robot with Advanced Onboard Audio, Vision and Deep Learning Capabilities

Maheux, Marc-Antoine | Université de Sherbrooke
Panchea, Adina Marlena | Université de Sherbrooke
Warren, Philippe | Université de Sherbrooke
Létourneau, Dominic | Université de Sherbrooke
Michaud, Francois | Université de Sherbrooke
Keywords: Robot Companions, Deep Learning Methods, Multi-Modal Perception for HRI
Abstract: In recent years, studies on Socially Assistive Robots (SARs) examine how to improve the quality of life of people living with dementia and older adults (OAs) in general. However, most SARs have somewhat limited perception capabilities or interact using simple pre-programmed responses, providing limited or repetitive interaction modalities. Integrating more advanced perceptual capabilities with deep learning processing would help move beyond such limitations. This paper presents T-Top, a tabletop robot designed with advanced audio and vision processing using deep learning neural networks. T-Top is made available as an open source platform with the goal of providing an experimental SAR platform that can implement richer interaction modalities with OAs.

09:12-09:18, Paper WeAT1.8
Learning Human Motion Intention for pHRI Assistive Control

Franceschi, Paolo | CNR-STIIMA
Bertini, Fabio | Politecnico Di Milano
Braghin, Francesco | Politecnico Di Milano
Roveda, Loris | SUPSI-IDSIA
Pedrocchi, Nicola | National Research Council of Italy (CNR)
Beschi, Manuel | University of Brescia
Keywords: Machine Learning for Robot Control, Physical Human-Robot Interaction, Intention Recognition
Abstract: This work addresses human intention identification during physical Human-Robot Interaction (pHRI) tasks to include this information in an assistive controller. To this purpose, human intention is defined as the desired trajectory that the human wants to follow over a finite rolling prediction horizon so that the robot can assist in pursuing it. This work investigates a Recurrent Neural Network (RNN), specifically a Long Short-Term Memory (LSTM) cascaded with a Fully Connected (FC) layer. In particular, we propose an iterative training procedure to adapt the model. Such an iterative procedure is powerful in reducing the prediction error. Still, it has the drawback that it is time-consuming and does not generalize to different users or different co-manipulated objects. To overcome this issue, Transfer Learning (TL) adapts the pre-trained model to new trajectories, users, and co-manipulated objects by freezing the LSTM layer and fine-tuning the last FC layer, which makes the procedure faster. Experiments show that the iterative procedure adapts the model and reduces prediction error. Experiments also show that TL adapts to different users and to the co-manipulation of a large object. Finally, to check the utility of adopting the proposed method, we compare the proposed controller, enhanced by the intention prediction, with two other standard pHRI controllers.

09:18-09:24, Paper WeAT1.9
VARIQuery: VAE Segment-Based Active Learning for Query Selection in Preference-Based Reinforcement Learning

Marta, Daniel | KTH Royal Institute of Technology
Holk, Simon | KTH Royal Institute of Technology
Pek, Christian | Delft University of Technology
Tumova, Jana | KTH Royal Institute of Technology
Leite, Iolanda | KTH Royal Institute of Technology
Keywords: Human Factors and Human-in-the-Loop, Reinforcement Learning, Representation Learning
Abstract: Human-in-the-loop reinforcement learning (RL) methods actively integrate human knowledge to create reward functions for various robotic tasks. Learning from preferences shows promise, as it alleviates the requirement of demonstrations by querying humans on state-action sequences. However, the limited granularity of sequence-based approaches complicates temporal credit assignment. The amount of human querying is contingent on query quality, as redundant queries result in excessive human involvement. This paper addresses the often-overlooked aspect of query selection, which is closely related to active learning (AL). We propose a novel query selection approach that leverages variational autoencoder (VAE) representations of state sequences. In this manner, we formulate queries that are diverse in nature while simultaneously taking reward model estimations into account. We compare our approach to the current state-of-the-art query selection methods in preference-based RL, and find ours to be either on par or more sample-efficient through extensive benchmarking on simulated environments relevant to robotics. Lastly, we conduct an online study to verify the effectiveness of our query selection approach with real human feedback and examine several metrics related to human effort.

09:24-09:30, Paper WeAT1.10
Interactive Spatiotemporal Token Attention Network for Skeleton-Based General Interactive Action Recognition

Wen, Yuhang | Sun Yat-Sen University
Tang, Zixuan | Sun Yat-Sen University
Pang, Yunsheng | Tencent Technology (Shenzhen) Co., Ltd., China
Ding, Beichen | Sun Yat-Sen University
Liu, Mengyuan | Sun Yat-Sen University
Keywords: Human and Humanoid Motion Analysis and Synthesis, Deep Learning for Visual Perception, Human-Robot Collaboration
Abstract: Recognizing interactive actions plays an important role in human-robot interaction and collaboration. Previous methods use late fusion and co-attention mechanisms to capture interactive relations, which have limited learning capability or are inefficient in adapting to more interacting entities. Under the assumption that priors of each entity are already known, they also lack evaluation in a more general setting addressing the diversity of subjects. To address these problems, we propose an Interactive Spatiotemporal Token Attention Network (ISTA-Net), which simultaneously models spatial, temporal, and interactive relations. Specifically, our network contains a tokenizer to partition Interactive Spatiotemporal Tokens (ISTs), which is a unified way to represent the motions of multiple diverse entities. By extending the entity dimension, ISTs provide better interactive representations. To jointly learn along the three dimensions in ISTs, multi-head self-attention blocks integrated with 3D convolutions are designed to capture inter-token correlations. When modeling correlations, a strict entity ordering is usually irrelevant for recognizing interactive actions. To this end, Entity Rearrangement is proposed to eliminate the orderliness in ISTs for interchangeable entities. Extensive experiments on four datasets verify the effectiveness of ISTA-Net by outperforming state-of-the-art methods. Our code is publicly available at https://github.com/Necolizer/ISTA-Net.

09:30-09:36, Paper WeAT1.11
Learning Joint Policies for Human-Robot Dialog and Co-Navigation

Hayamizu, Yohei | SUNY Binghamton
Yu, Zhou | Columbia University
Zhang, Shiqi | SUNY Binghamton
Keywords: Natural Dialog for HRI, Reinforcement Learning, Service Robotics
Abstract: Service robots need language capabilities for communicating with people, and navigation skills for beyond-proximity interaction in the real world. When the robot explores the real world with people side by side, there is the compound problem of human-robot dialog and co-navigation. The human-robot team uses dialog to decide where to go, and their shared spatial awareness affects the dialog state. In this paper, we develop a framework that learns a joint policy for human-robot dialog and co-navigation toward efficiently and accurately completing tour guide and information delivery tasks. We show that our approach outperforms baselines from the literature in task completion rate and execution time, and demonstrate our approach in the real world.

09:36-09:42, Paper WeAT1.12
Natural Language Specification of Reinforcement Learning Policies through Differentiable Decision Trees

Tambwekar, Pradyumna | Georgia Institute of Technology
Silva, Andrew | Georgia Institute of Technology
Gopalan, Nakul | Arizona State University
Gombolay, Matthew | Georgia Institute of Technology
Keywords: Human-Centered Automation, Human-Centered Robotics
Abstract: Human-AI policy specification is a novel procedure we define in which humans can collaboratively warm-start a robot's reinforcement learning policy. This procedure comprises two steps: (1) Policy Specification, i.e. humans specifying the behavior they would like their companion robot to accomplish, and (2) Policy Optimization, i.e. the robot applying reinforcement learning to improve the initial policy. Existing approaches to enabling collaborative policy specification are often unintelligible black-box methods, and are not catered towards making the autonomous system accessible to a novice end-user. In this paper, we develop a novel collaborative framework to enable humans to initialize and interpret an autonomous agent's behavior. Through our framework, we enable humans to specify an initial behavior model via unstructured natural language, which we convert to lexical decision trees. Next, we leverage these translated human specifications to warm-start reinforcement learning and allow the agent to further optimize these potentially suboptimal policies. Our approach warm-starts an RL agent by utilizing non-expert natural language specifications without incurring the additional domain exploration costs. We validate our approach by showing that our model is able to produce >80% translation accuracy, and that policies initialized by a human are able to match the performance of relevant RL baselines in two differing domains.

09:42-09:48, Paper WeAT1.13
Robots Autonomously Detecting People: A Multimodal Deep Contrastive Learning Method Robust to Intraclass Variations

Fung, Angus | University of Toronto
Benhabib, Beno | University of Toronto
Nejat, Goldie | University of Toronto
Keywords: Human-Centered Robotics, Deep Learning for Visual Perception, Human Detection and Tracking
Abstract: Robotic detection of people in crowded and/or cluttered human-centered environments including hospitals, stores and airports is challenging as people can become occluded by other people or objects, and deform due to clothing or pose variations. There can also be loss of discriminative visual features due to poor lighting. In this paper, we present a novel multimodal person detection architecture to address the mobile robot problem of person detection under intraclass variations. We present a two-stage training approach using: 1) a unique pretraining method we define as Temporal Invariant Multimodal Contrastive Learning (TimCLR), and 2) a Multimodal YOLOv4 (MYOLOv4) detector for finetuning. TimCLR learns person representations that are invariant under intraclass variations through unsupervised learning. Our approach is unique in that it generates image pairs from natural variations within multimodal image sequences and contrasts crossmodal features to transfer invariances between different modalities. These pretrained features are used by the MYOLOv4 detector for finetuning and person detection from RGB-D images. Extensive experiments validate the performance of our DL architecture in both human-centered crowded and cluttered environments. Results show that our method outperforms existing unimodal and multimodal person detection approaches in detection accuracy when considering body occlusions and pose deformations in different lighting.

WeAT2 Regular session, 140B
Human and Robot Teaming

Chair: Fitter, Naomi T. | Oregon State University
Co-Chair: Yang, X. Jessie | University of Michigan

08:30-08:36, Paper WeAT2.1
Initial Task Allocation for Multi-Human Multi-Robot Teams with Attention-Based Deep Reinforcement Learning

Wang, Ruiqi | Purdue University
Zhao, Dezhong | Beijing University of Chemical Technology
Min, Byung-Cheol | Purdue University
Keywords: Human-Robot Teaming, Task Planning, Human-Robot Collaboration
Abstract: Multi-human multi-robot teams have great potential for complex and large-scale tasks through the collaboration of humans and robots with diverse capabilities and expertise. To efficiently operate such highly heterogeneous teams and maximize team performance in a timely manner, sophisticated initial task allocation strategies that consider individual differences across team members and tasks are required. While existing works have shown promising results in reallocating tasks based on agent state and performance, the neglect of the inherent heterogeneity of the team hinders their effectiveness in realistic scenarios. In this paper, we present a novel formulation of the initial task allocation problem in multi-human multi-robot teams as a contextual multi-attribute decision-making process and propose an attention-based deep reinforcement learning approach. We introduce a cross-attribute attention module to encode the latent and complex dependencies of multiple attributes in the state representation. We conduct a case study in a massive threat surveillance scenario and demonstrate the strengths of our model.

08:36-08:42, Paper WeAT2.2
Human-Robot Collaboration for Unknown Flexible Surface Exploration and Treatment Based on Mesh Iterative Learning Control

Xia, Jingkang | Southwest Jiaotong University, School of Electrical Engineering
Dickwella Widanage, Kithmi Nima | University of Sussex
Zhang, Ruiqing | Southwest Jiaotong University
Parween, Rizuwana | University of Sussex
Godaba, Hareesh | University of Sussex
Herzig, Nicolas | University of Sussex
Glovnea, Romeo | University of Sussex
Huang, Deqing | Southwest Jiaotong University
Li, Yanan | University of Sussex
Keywords: Human-Robot Collaboration, Model Learning for Control, Robust/Adaptive Control
Abstract: Contact tooling operations like sanding and polishing have been high in demand for robotics and automation, as manual operations are labour-intensive with inconsistent quality. However, automating these operations remains a challenge since they are highly dependent on prior knowledge about the geometry of the workpiece. While several methods have been developed in existing research to automate the geometry learning process and adjust the contact force, human supervision is heavily required in the calibration of workpieces and the path planning of robot motion in such methods. Furthermore, the stiffness identification of the workpiece is not considered in most of these methods. This paper presents a human-robot collaboration (HRC) framework, which is able to perform surface exploration on an unknown object combining the operator's flexibility with the control precision of the robot. The operator moves the robot along the surface of the target object, and the robot recognizes the surface geometry and surface stiffness while exerting a desired contact force through control. For this purpose, a mesh iterative learning control (MILC) is developed to learn the surface stiffness, plan the exploration path, and adjust contact force through repetitive online correction based on HRC. The proof of learning convergence and the results of the simulation and experiments on a 7-DOF Sawyer robot platform illustrate the validity of the proposed method.

08:42-08:48, Paper WeAT2.3
Projecting Robot Intentions through Visual Cues: Static vs. Dynamic Signaling

Sonawani, Shubham | Arizona State University
Zhou, Yifan | Arizona State University
Ben Amor, Heni | Arizona State University
Keywords: Virtual Reality and Interfaces, Human-Robot Collaboration, Human-Robot Teaming
Abstract: Augmented and mixed-reality techniques harbor great potential for improving human-robot collaboration. Visual signals and cues may be projected to a human partner in order to explicitly communicate robot intentions and goals. However, it is unclear what type of signals support such a process and whether signals can be combined without adding additional cognitive stress to the partner. This paper focuses on identifying the effective types of visual signals and quantifying their impact through empirical evaluations. In particular, the study compares static and dynamic visual signals within a collaborative object sorting task and assesses their ability to shape human behavior. Furthermore, an information-theoretic analysis is performed to numerically quantify the degree of information transfer between visual signals and human behavior. The results of a human subject experiment show that there are significant advantages to combining multiple visual signals within a single task, i.e., increased task efficiency and reduced cognitive load.

08:48-08:54, Paper WeAT2.4
Robottheory Fitness: GoBot’s Engagement Edge for Spurring Physical Activity in Young Children

Morales Mayoral, Rafael | Oregon State University
Helmi, Ameer | Oregon State University
Warren, Shel-Twon | University of Arkansas
Logan, Samuel W. | Oregon State University
Fitter, Naomi T. | Oregon State University
Keywords: Robot Companions, Long term Interaction, Social HRI
Abstract: Children around the world are growing more sedentary over time, which leads to considerable accompanying wellness challenges. Pilot results from our research group have shown that robots may offer something different or better than other developmentally appropriate toys when it comes to motivating physical activity. However, the foundations of this work involved larger-group interactions in which it was difficult to tease apart potential causes of motion, or one-time sessions during which the impact of the robot may have been due to novelty. Accordingly, the work in this paper covers more controlled interactions focused on one robot and one child participant, in addition to considering interactions over longitudinal observation. We discuss the results of a deployment during which N = 8 participants interacted with our custom GoBot robot over two months of weekly sessions. Within each session, the child users experienced a teleoperated robot mode, a semi-autonomous robot mode, and a control condition during which the robot was present but inactive. Results showed that children tended to be more active when the robot was active and the teleoperated mode did not yield significantly different results than the semi-autonomous mode. These insights can guide future application of assistive robots in child motor interventions, in addition to informing how these robots can be equipped to assist busy human clinicians.

08:54-09:00, Paper WeAT2.5
Implicit Projection: Improving Team Situation Awareness for Tacit Human-Robot Interaction Via Virtual Shadows

Boateng, Andrew | Arizona State University
Zhang, Wenlong | Arizona State University
Zhang, Yu (Tony) | Arizona State University
Keywords: Virtual Reality and Interfaces, Human-Robot Teaming, Design and Human Factors
Abstract: Fluent teaming is characterized by tacit interaction without explicit communication. Such interaction requires team situation awareness (TSA) to facilitate it. However, existing approaches often rely on explicit communication (such as visual projection) to support TSA, resulting in a paradox. In this paper, we consider implicit projection (IP) to improve TSA for tacit human-robot interaction. IP minimizes interruption and can thus reduce the cognitive demand of maintaining TSA in teaming. We introduce a novel process for achieving IP via virtual shadows (referred to as IPS). We compare our method with two baselines that use explicit projection to maintain TSA. Results from human factors studies demonstrate that IPS supports better TSA and significantly improves unsolicited human responsiveness to robots, a key feature of fluent teaming. Participants rated robots implementing IPS more favorably as teammates. Simultaneously, our results also demonstrate that IPS is comparable to, and sometimes better than, the best-performing baselines on information accuracy.

09:00-09:06, Paper WeAT2.6
User Interactions and Negative Examples to Improve the Learning of Semantic Rules in a Cognitive Exercise Scenario

Suárez-Hernández, Alejandro | CSIC-UPC
Andriella, Antonio | Pal Robotics
Torras, Carme | CSIC-UPC
Alenyà, Guillem | CSIC-UPC
Keywords: Human-Robot Collaboration, Human-Robot Teaming, Social HRI
Abstract: Enabling a robot to perform new tasks is a complex endeavor, usually beyond the reach of non-technical users. For this reason, research efforts that aim at empowering end-users to teach robots new abilities using intuitive modes of interaction are valuable. In this article, we present INtuitive PROgramming 2 (INPRO2), a learning framework that allows inferring planning actions from demonstrations given by a human teacher. INPRO2 operates in an assistive scenario, in which the robot may learn from a healthcare professional (a therapist or caregiver) new cognitive exercises that can be later administered to patients with cognitive impairment. INPRO2 features significant improvements over previous work, namely: (1) exploitation of negative examples; (2) proactive interaction with the teacher to ask questions about the legality of certain movements; and (3) learning goals in addition to legal actions. Through simulations, we show the performance of different proactive strategies for gathering negative examples. Real-world experiments with human teachers and a TIAGo robot are also presented to qualitatively illustrate INPRO2.
|
|
09:06-09:12, Paper WeAT2.7 | Add to My Program |
Large Language Models As Zero-Shot Human Models for Human-Robot Interaction |
|
Zhang, Bowen | National University of Singapore |
Soh, Harold | National University of Singapore |
Keywords: Human-Robot Collaboration, Human-Centered Robotics, Cognitive Modeling
Abstract: Human models play a crucial role in human-robot interaction (HRI), enabling robots to consider the impact of their actions on humans and plan their behavior accordingly. However, crafting good human models is challenging; capturing context-dependent human behavior requires significant prior knowledge and/or large amounts of interaction data, both of which are difficult to obtain. In this work, we explore the potential of large language models (LLMs) --- which have consumed vast amounts of human-generated text data --- to act as zero-shot human models for HRI. Our experiments on three social datasets yield promising results; the LLMs are able to achieve performance comparable to purpose-built models. That said, we also discuss current limitations, such as sensitivity to prompts and spatial/numerical reasoning mishaps. Based on our findings, we demonstrate how LLM-based human models can be integrated into a social robot's planning process and applied in HRI scenarios. Specifically, we present one case study on a simulated trust-based table-clearing task and replicate past results that relied on custom models. Next, we conduct a new robot utensil-passing experiment (n = 65) where preliminary results show that planning with an LLM-based human model can achieve gains over a basic myopic plan. In summary, our results show that LLMs offer a promising (but incomplete) approach to human modeling for HRI.
|
|
09:12-09:18, Paper WeAT2.8 | Add to My Program |
MPC-Based Human-Accompanying Control Strategy for Improving the Motion Coordination between the Target Person and the Robot |
|
Peng, Jianwei | University of Chinese Academy of Sciences |
Liao, Zhelin | Fujian Agriculture and Forestry University |
Yao, Hanchen | Fujian Institute of Research on the Structure of Matter, Chinese |
Su, Zefan | Fuzhou University |
Zeng, Yadan | Nanyang Technological University
Dai, Houde | Haixi Institutes, Chinese Academy of Sciences |
Keywords: Human-Robot Collaboration, Service Robotics, Human-Robot Teaming
Abstract: Social robots have gained widespread attention for their potential to assist people in diverse domains, such as living assistance and logistics transportation. Human-accompanying, i.e., walking side-by-side with a person, is an expected and essential capability for social robots. However, due to the complexity of motion coordination between the target person and the mobile robot, the accompanying action is still unstable. In this study, we propose a human-accompanying control strategy that improves motion coordination and thus the practicability of human-accompanying robots. Our approach allows the robot to adapt to the motion variations of the target person and avoid obstacles while accompanying them. First, a human-robot interaction model based on the separation-bearing-orientation scheme is developed to ascertain the relative position and orientation between the robot and the target person. Then, a human-accompanying controller based on behavioral dynamics and model predictive control (MPC) is designed to avoid obstacles and simultaneously track the direction and velocity of the target person. Experimental results indicate that the proposed method can effectively achieve side-by-side accompanying by simultaneously controlling the relative position, direction, and velocity between the target person and the robot.
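The separation-bearing-orientation relation mentioned in the abstract can be illustrated with a minimal geometric sketch. All symbols and defaults below are illustrative stand-ins, not the paper's actual notation or controller:

```python
import math

# Illustrative sketch of a separation-bearing relation for side-by-side
# accompanying: the robot tracks a reference point at a fixed separation d
# and bearing angle relative to the person's heading. The names, default
# values, and left-side convention here are assumptions for illustration.

def accompany_target(person_xy, person_heading, d=1.0, bearing=math.pi / 2):
    """Reference position the robot should track (here: to the person's left)."""
    px, py = person_xy
    angle = person_heading + bearing
    return (px + d * math.cos(angle), py + d * math.sin(angle))

# Person at the origin walking along +x: the reference is 1 m to the left.
ref = accompany_target((0.0, 0.0), 0.0)
```

In an MPC setting such as the one described, the controller would penalize deviation of the robot's predicted positions from this reference over the prediction horizon, alongside obstacle-avoidance and velocity-tracking terms.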
|
|
09:18-09:24, Paper WeAT2.9 | Add to My Program |
Improved Inference of Human Intent by Combining Plan Recognition and Language Feedback |
|
Idrees, Ifrah | Brown University |
Yun, Tian | Brown University |
Sharma, Naveen | Humans to Robots Lab at Brown University |
Deng, Yunxin | Brown University |
Gopalan, Nakul | Arizona State University |
Tellex, Stefanie | Brown |
Konidaris, George | Brown University |
Keywords: Human-Robot Collaboration, Intention Recognition, Human-Centered Robotics
Abstract: Conversational assistive robots can aid people, especially those with cognitive impairments, to accomplish various tasks such as cooking meals, performing exercises, or operating machines. However, to interact with people effectively, robots must recognize human plans and goals from noisy observations of human actions, even when the user acts sub-optimally. Previous works on Plan and Goal Recognition (PGR) as planning have used hierarchical task networks (HTN) to model the actor/human. However, these techniques are insufficient as they do not engage the user via natural modes of interaction such as language. Moreover, they have no mechanisms to let users, especially those with cognitive impairments, know of a deviation from their original plan or about any sub-optimal actions taken towards their goal. We propose a novel framework for plan and goal recognition in partially observable domains—Dialogue for Goal Recognition (D4GR)—enabling a robot to rectify its belief in human progress by asking clarification questions about noisy sensor data and sub-optimal human actions. We evaluate the performance of D4GR over two simulated domains—kitchen and blocks domain. With language feedback and the world state information in a hierarchical task model, we show that, at the highest sensor noise level, D4GR performs 1% better than HTN in goal accuracy in both domains. For plan accuracy, D4GR outperforms HTN by 4% in the kitchen domain and 2% in the blocks domain. The ALWAYS-ASK oracle outperforms our policy by 3% in goal recognition and 7% in plan recognition; D4GR achieves this while asking 68% fewer questions than the oracle baseline. We also demonstrate a real-world robot scenario in the kitchen domain, validating the improved plan and goal recognition of D4GR in a realistic setting.
|
|
09:24-09:30, Paper WeAT2.10 | Add to My Program |
Online Human Capability Estimation through Reinforcement Learning and Interaction |
|
Sun, Chengke | University of Leeds |
Cohn, Anthony | University of Leeds |
Leonetti, Matteo | King's College London |
Keywords: Human-Robot Collaboration, Reinforcement Learning
Abstract: Service robots are expected to assist users in a constantly growing range of environments and tasks. People may be unique in many ways, and online adaptation of robots is central to personalized assistance. We focus on collaborative tasks in which the human collaborator may not be fully able-bodied, with the aim of having the robot automatically determine the best level of support. We propose a methodology for online adaptation based on Reinforcement Learning and Bayesian inference. As the Reinforcement Learning process continuously adjusts the robot's behavior, the actions that become part of the improved policy are used by the Bayesian inference module as local evidence of human capability, which can be generalized across the state space. The estimated capabilities are then used as pre-conditions to collaborative actions, so that the robot can quickly disable actions that the person seems unable to perform. We demonstrate and validate our approach on two simulated tasks and one real-world collaborative task across a range of motion and sensing capabilities.
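The idea of maintaining a belief over a person's capability and gating actions on it can be illustrated with a minimal Beta-Bernoulli sketch. All class names, priors, and thresholds below are illustrative assumptions, not the paper's actual model (which additionally generalizes local evidence across the state space):

```python
# Minimal sketch of online capability estimation via Bayesian updating.
# A Beta-Bernoulli belief is a simple stand-in for the paper's inference
# module; names, the uniform prior, and the 0.5 threshold are assumptions.

class CapabilityEstimate:
    """Beta-Bernoulli belief over whether a person can perform an action."""

    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha = alpha  # pseudo-count of observed successes
        self.beta = beta    # pseudo-count of observed failures

    def update(self, success):
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def p_capable(self):
        return self.alpha / (self.alpha + self.beta)

# Estimated capability serves as a precondition: the robot disables
# collaborative actions whose belief falls below a threshold.
reach_high_shelf = CapabilityEstimate()
for outcome in [False, False, True, False]:
    reach_high_shelf.update(outcome)

if reach_high_shelf.p_capable < 0.5:
    print("disable action: hand-object-at-height")  # robot takes over this step
```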
|
|
09:30-09:36, Paper WeAT2.11 | Add to My Program |
Cognitive Approach to Hierarchical Task Selection for Human-Robot Interaction in Dynamic Environments |
|
Bukhari, Syed Tanweer Shah | University of Central Punjab |
Anima, Bashira Akter | University of Nevada, Reno |
Feil-Seifer, David | University of Nevada, Reno |
Qazi, Wajahat Mahmood | Intelligent Machines & Robotics Group, Department of Computer Sc |
Keywords: Human-Robot Collaboration, Cognitive Control Architectures, Human-Robot Teaming
Abstract: In an efficient and flexible human-robot collaborative work environment, a robot team member must be able to recognize both explicit requests and implied actions from human users. Identifying “what to do” in such cases requires an agent to have the ability to construct associations between objects, their actions, and the effect of actions on the environment. In this regard, semantic memory is introduced to understand explicit cues and their relationships with the objects and skills required to make “tea” and a “sandwich”. We have extended our previous hierarchical robot control architecture to add the capability to execute the most appropriate task based on both feedback from the user and the environmental context. To validate this system, two types of skills were implemented in the hierarchical task tree: 1) tea-making skills and 2) sandwich-making skills. During the conversation between the robot and the human, the robot was able to determine the hidden context using an ontology and began to act accordingly. For instance, if the person says “I am thirsty” or “It is cold outside”, the robot will start to perform the tea-making skill. In contrast, if the person says “I am hungry” or “I need something to eat”, the robot will make a sandwich. A humanoid robot, Baxter, was used for this experiment. We tested three scenarios with objects at different positions on the table for each skill. We observed that in all cases, the robot used only objects that were relevant to the skill.
|
|
09:36-09:42, Paper WeAT2.12 | Add to My Program |
Reward Shaping for Building Trustworthy Robots in Sequential Human-Robot Interaction |
|
Guo, Yaohui | University of Michigan, Ann Arbor |
Yang, X. Jessie | University of Michigan |
Shi, Cong | University of Miami |
Keywords: Human-Robot Teaming, Human-Centered Automation, Acceptability and Trust
Abstract: Trust-aware human-robot interaction (HRI) has received increasing research attention, as trust has been shown to be a crucial factor for effective HRI. Research in trust-aware HRI discovered a dilemma -- maximizing task rewards often leads to decreased human trust, while maximizing human trust would compromise task performance. In this work, we address this dilemma by formulating the HRI process as a two-player Markov game and utilizing the reward-shaping technique to improve human trust while limiting performance loss. Specifically, we show that when the shaping reward is potential-based, the performance loss can be bounded by the potential functions evaluated at the final states of the Markov game. We apply the proposed framework to the experience-based trust model, resulting in a linear program that can be efficiently solved and deployed in real-world applications. We evaluate the proposed framework in a simulation scenario where a human-robot team performs a search-and-rescue mission. The results demonstrate that the proposed framework successfully modifies the robot's optimal policy, enabling it to increase human trust at a minimal task performance cost.
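The potential-based shaping referred to in the abstract follows the classic form F(s, s') = γΦ(s') − Φ(s), which leaves optimal policies unchanged; the paper's contribution is bounding the performance loss by the potential at the final states. A minimal sketch, in which the trust potential Φ is a hypothetical stand-in and not the paper's actual trust model:

```python
# Minimal sketch of potential-based reward shaping,
#   F(s, s') = gamma * phi(s') - phi(s),
# added to the task reward. The "trust" potential below is a hypothetical
# placeholder; the paper derives phi from an experience-based trust model.

GAMMA = 0.99

def phi(state):
    """Hypothetical potential: higher when estimated human trust is higher."""
    return state["trust"]

def shaped_reward(task_reward, state, next_state, gamma=GAMMA):
    return task_reward + gamma * phi(next_state) - phi(state)

s0 = {"trust": 0.2}
s1 = {"trust": 0.5}  # a robot action that increases estimated trust
r = shaped_reward(1.0, s0, s1)
```

Because the shaping terms telescope along a trajectory, the cumulative shaped return differs from the task return only through the potentials at the start and final states, which is the quantity the paper's performance-loss bound is expressed in.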
|
|
09:42-09:48, Paper WeAT2.13 | Add to My Program |
Latent Emission-Augmented Perspective-Taking (LEAPT) for Human-Robot Interaction |
|
Chen, Kaiqi | National University of Singapore |
Lim, Jing Yu | National University of Singapore |
Kuan, Kingsley | National University of Singapore |
Soh, Harold | National University of Singapore |
Keywords: Human-Robot Collaboration, Probabilistic Inference, Representation Learning
Abstract: Perspective-taking is the ability to perceive or understand a situation or concept from another individual's point of view, and is crucial in daily human interactions. Enabling robots to perform perspective-taking remains an unsolved problem; existing approaches that use deterministic or handcrafted methods are unable to accurately account for uncertainty in partially-observable settings. This work proposes to address this limitation via a deep world model that enables a robot to perform both perceptual and conceptual perspective-taking, i.e., the robot is able to infer what a human sees and believes. The key innovation is to leverage a decomposed multi-modal latent state space model to generate and augment fictitious observations. Optimizing the ELBO that arises from the underlying probabilistic graphical model enables the learning of uncertainty in latent space, which facilitates uncertainty estimation from high-dimensional observations. We tasked our model to predict human observations and beliefs on three partially-observable HRI tasks. Experiments show that our method significantly outperforms existing baselines and is able to infer the visual observations available to other agents and their internal beliefs.
|
|
09:48-09:54, Paper WeAT2.14 | Add to My Program |
Robust and Context-Aware Real-Time Collaborative Robot Handling Via Dynamic Gesture Commands |
|
Chen, Rui | Carnegie Mellon University; University of Michigan; |
Shek, Alvin | Carnegie Mellon University |
Liu, Changliu | Carnegie Mellon University |
Keywords: Human-Robot Collaboration, Gesture, Posture and Facial Expressions, Learning from Demonstration
Abstract: This paper studies real-time collaborative robot (cobot) handling, where the cobot maneuvers an object under human dynamic gesture commands. Enabling dynamic gesture commands is useful when the human needs to avoid direct contact with the robot or the object handled by the robot. However, the key challenge lies in the heterogeneity of human behaviors and the stochasticity in the perception of dynamic gestures, which requires the robot handling policy to be adaptable and robust. To address these challenges, we introduce the Conditional Collaborative Handling Process (CCHP) to encode a context-aware cobot handling policy and a procedure to learn such a policy from human-human collaboration. We thoroughly evaluate the adaptability and robustness of CCHP and apply our approach to a real-time cobot assembly task with a Kinova Gen3 robot arm. Results show that our method leads to significantly less human effort and smoother human-robot collaboration than a state-of-the-art rule-based approach, even with first-time users.
|
|
WeAT3 Regular session, 140C |
Add to My Program |
Field Robots I |
|
|
Chair: Ghaffari, Maani | University of Michigan |
Co-Chair: Driggs-Campbell, Katherine | University of Illinois at Urbana-Champaign |
|
08:30-08:36, Paper WeAT3.1 | Add to My Program |
X-MAS: Extremely Large-Scale Multi-Modal Sensor Dataset for Outdoor Surveillance in Real Environments |
|
Noh, DongKi | LG Electronics Inc |
Sung, Chang Ki | KAIST |
Uhm, Taeyoung | Korean Institute of Robotics and Technology Convergence |
Lee, Wooju | KAIST |
Lim, Hyungtae | Korea Advanced Institute of Science and Technology |
Choi, Jaeseok | Seoul National University |
Lee, Kyuewang | Seoul National University |
Hong, Dasol | KAIST |
Um, Daeho | Seoul National University |
Chung, Inseop | Seoul National University |
Shin, Hochul | Electronics and Telecommunications Research Institute |
Kim, Min-Jung | KAIST |
Kim, Hyoung-Rock | LG Electronics Co. Advanced Research Institute |
Baek, Seung-Min | LG Electronics |
Myung, Hyun | KAIST (Korea Advanced Institute of Science and Technology) |
Keywords: Surveillance Robotic Systems, Field Robots, Data Sets for Robot Learning
Abstract: In robotics and computer vision communities, extensive studies have been widely conducted regarding surveillance tasks, including human detection, tracking, and motion recognition with a camera. Additionally, deep learning algorithms are widely utilized in the aforementioned tasks as in other computer vision tasks. Existing public datasets are insufficient for developing learning-based methods that handle surveillance in outdoor and extreme situations such as harsh weather and low-illuminance conditions. Therefore, we introduce a new large-scale outdoor surveillance dataset named the eXtremely large-scale Multi-modAl Sensor dataset (X-MAS), containing more than 500,000 image pairs and first-person-view data annotated by well-trained annotators. Moreover, a single pair contains multi-modal data (e.g., an IR image, an RGB image, a thermal image, a depth image, and a LiDAR scan). To the best of our knowledge, this is the first large-scale first-person-view outdoor multi-modal dataset focusing on surveillance tasks. We present an overview of the proposed dataset with statistics and present methods of exploiting our dataset with deep learning-based algorithms. The latest information on the dataset and our study are available at https://github.com/lge-robot-navi, and the dataset will be available for download through a server.
|
|
08:36-08:42, Paper WeAT3.2 | Add to My Program |
Athletic Mobile Manipulator System for Robotic Wheelchair Tennis |
|
Zaidi, Zulfiqar | Georgia Institute of Technology |
Martin, Daniel | Georgia Institute of Technology |
Belles, Nathaniel | Georgia Institute of Technology |
Zakharov, Viacheslav | Georgia Institute of Technology |
Krishna, Arjun | Georgia Institute of Technology |
Lee, Kin Man | Georgia Institute of Technology |
Wagstaff, Peter | Georgia Institute of Technology |
Naik, Sumedh | Georgia Institute of Technology |
Sklar, Matthew | Georgia Institute of Technology |
Choi, Sugju | Georgia Institute of Technology |
Kakehi, Yoshiki | Georgia Institute of Technology |
Patil, Ruturaj | Georgia Institute of Technology |
Mallemadugula, Divya | Georgia Institute of Technology |
Pesce, Florian | Georgia Institute of Technology |
Wilson, Peter | Georgia Institute of Technology |
Hom, Wendell | Georgia Institute of Technology |
Diamond, Matan | Georgia Institute of Technology |
Zhao, Bryan | Georgia Institute of Technology |
Moorman, Nina | Georgia Institute of Technology |
Paleja, Rohan | Georgia Institute of Technology |
Chen, Letian | Georgia Institute of Technology |
Seraj, Esmaeil | Georgia Institute of Technology |
Gombolay, Matthew | Georgia Institute of Technology |
Keywords: Engineering for Robotic Systems
Abstract: Athletics are a quintessential and universal expression of humanity. From French monks who in the 12th century invented jeu de paume, the precursor to modern lawn tennis, back to the K'iche' people who played the Maya Ballgame as a form of religious expression over three thousand years ago, humans have sought to train their minds and bodies to excel in sporting contests. Advances in robotics are opening up the possibility of robots in sports. Yet, key challenges remain, as most prior works in robotics for sports are limited to pristine sensing environments, do not require significant force generation, or are on miniaturized scales unsuited for joint human-robot play. In this paper, we propose the first open-source, autonomous robot for playing regulation wheelchair tennis. We demonstrate the performance of our full-stack system in executing ground strokes and evaluate each of the system's hardware and software components. The goal of this paper is to (1) inspire more research in human-scale robot athletics and (2) establish the first baseline for a reproducible wheelchair tennis robot for regulation singles play. Our paper contributes to the science of systems design and poses a set of key challenges for the robotics community to address in striving towards robots that can match human capabilities in sports.
|
|
08:42-08:48, Paper WeAT3.3 | Add to My Program |
An Attentional Recurrent Neural Network for Occlusion-Aware Proactive Anomaly Detection in Field Robot Navigation |
|
Schreiber, Andre | University of Illinois Urbana-Champaign |
Ji, Tianchen | University of Illinois at Urbana-Champaign |
McPherson, D. Livingston | University of Illinois |
Driggs-Campbell, Katherine | University of Illinois at Urbana-Champaign |
Keywords: Robotics and Automation in Agriculture and Forestry, Sensor Fusion, Failure Detection and Recovery
Abstract: The use of mobile robots in unstructured environments like the agricultural field is becoming increasingly common. The ability for such field robots to proactively identify and avoid failures is thus crucial for ensuring efficiency and avoiding damage. However, the cluttered field environment introduces various sources of noise (such as sensor occlusions) that make proactive anomaly detection difficult. Existing approaches can show poor performance in sensor occlusion scenarios as they typically do not explicitly model occlusions and only leverage current sensory inputs. In this work, we present an attention-based recurrent neural network architecture for proactive anomaly detection that fuses current sensory inputs and planned control actions with a latent representation of prior robot state. We enhance our model with an explicitly-learned model of sensor occlusion that is used to modulate the use of our latent representation of prior robot state. Our method shows improved anomaly detection performance and enables mobile field robots to display increased resilience to predicting false positives regarding navigation failure during periods of sensor occlusion, particularly in cases where all sensors are briefly occluded. Our code is available at: https://github.com/andreschreiber/roar.
|
|
08:48-08:54, Paper WeAT3.4 | Add to My Program |
Collaborative Trolley Transportation System with Autonomous Nonholonomic Robots |
|
Xia, Bingyi | Southern University of Science and Technology |
Luan, Hao | National University of Singapore |
Zhao, Ziqi | Southern University of Science and Technology |
Gao, Xuheng | Southern University of Science and Technology |
Xie, Peijia | Southern University of Science and Technology |
Xiao, Anxing | National University of Singapore |
Wang, Jiankun | Southern University of Science and Technology |
Meng, Max Q.-H. | The Chinese University of Hong Kong |
Keywords: Automation Technologies for Smart Cities, Intelligent Transportation Systems, Multi-Robot Systems
Abstract: Cooperative object transportation using multiple robots has been intensively studied in the control and robotics literature, but most approaches are either only applicable to omnidirectional robots or lack a complete navigation and decision-making framework that operates in real time. This paper presents an autonomous nonholonomic multi-robot system and an end-to-end hierarchical autonomy framework for collaborative luggage trolley transportation. This framework finds kinematically feasible paths, computes online motion plans, and provides feedback that enables the multi-robot system to handle long lines of luggage trolleys and navigate obstacles and pedestrians while dealing with multiple inherently complex and coupled constraints. We demonstrate the designed collaborative trolley transportation system through practical transportation tasks, and the experimental results reveal its effectiveness and reliability in complex and dynamic environments.
|
|
08:54-09:00, Paper WeAT3.5 | Add to My Program |
Deconfounded Opponent Intention Inference for Football Multi-Player Policy Learning |
|
Wang, Shijie | Institute of Automation, Chinese Academy of Sciences |
Pan, Yi | Institute of Automation, Chinese Academy of Sciences |
Pu, Zhiqiang | University of Chinese Academy of Sciences; Institute of Automati |
Liu, Boyin | University of Chinese Academy of Sciences School of Artificial I |
Yi, Jianqiang | Chinese Academy of Sciences |
Keywords: Agent-Based Systems, Reinforcement Learning, Probabilistic Inference
Abstract: Due to the high complexity of a football match, the opponents' strategies are variable and unknown. Thus predicting the opponents' future intentions accurately based on current situation is crucial for football players' decision-making. To better anticipate the opponents and learn more effective strategies, a deconfounded opponent intention inference (DOII) method for football multi-player policy learning is proposed in this paper. Specifically, opponents' intentions are inferred by an opponent intention supervising module. Furthermore, for some confounders which affect the causal relationship among the players and the opponents, a deconfounded trajectory graph module is designed to mitigate the influence of these confounders and increase the accuracy of the inferences about opponents' intentions. Besides, an opponent-based incentive module is designed to improve the players' sensitivity to the opponents' intentions and further to train reasonable players' strategies. Representative results indicate that DOII can effectively improve the performance of players' strategies in the Google Research Football environment, which validates the superiority of the proposed method.
|
|
09:00-09:06, Paper WeAT3.6 | Add to My Program |
Stroke-Based Rendering and Planning for Robotic Performance of Artistic Drawing |
|
Ilinkin, Ivaylo | Gettysburg College |
Song, Daeun | Ewha Womans University |
Kim, Young J. | Ewha Womans University |
Keywords: Art and Entertainment Robotics, Simulation and Animation, Virtual Reality and Interfaces
Abstract: We present a new robotic drawing system based on stroke-based rendering (SBR). Our motivation is the artistic quality of the whole performance. Not only should the generated strokes in the final drawing resemble the input image, but the stroke sequence should also exhibit a human artist’s planning process. Thus, when a robot executes the drawing task, both the drawing results and the way the robot executes would look artistic. Our SBR system is based on image segmentation and depth estimation. It generates the drawing strokes in an order that allows for the intended shape to be perceived quickly and for its detailed features to be filled in and emerge gradually when observed by the human. This ordering represents a stroke plan that the drawing robot should follow to create an artistic rendering of images. We experimentally demonstrate that our SBR-based drawing makes visually pleasing artistic images, and our robotic system can replicate the result with proper sequences of stroke drawing.
|
|
09:06-09:12, Paper WeAT3.7 | Add to My Program |
Heterogeneous Robot-Assisted Services in Isolation Wards: A System Development and Usability Study |
|
Kwon, Youngsun | Electronics and Telecommunications Research Institute |
Shin, Soyeon | LG Electronics |
Yang, Kyon-Mo | Korea Institute of Robot and Convergence |
Park, Seongah | Korea Institute of Science and Technology (KIST) |
Shin, Soomin | KIST |
Jeon, Hwawoo | Korea Institute of Science and Technology, and Hanyang Univ |
Kim, Kijung | Korea Institute of Science and Technology |
Yun, Guhnoo | Korea Institute of Science and Technology |
Park, Sang Yong | Korea Institute of Science and Technology |
Byun, Jeewon | Softnet |
Kang, Sang Hoon | Ulsan National Institute of Science and Technology(UNIST) / U. O |
Song, Kyoung-Ho | Seoul National University Bundang Hospital |
Kim, Doik | KIST |
Kim, Dong Hwan | Korea Institute of Science and Technology |
Seo, Kap-Ho | Korea Institute of Robot and Convergence |
Kwak, Sonya Sona | Korea Institute of Science and Technology (KIST) |
Lim, Yoonseob | Korea Institute of Science and Technology |
Keywords: Service Robotics, Product Design, Development and Prototyping, Software-Hardware Integration for Robot Systems
Abstract: Isolation wards operate quarantine rooms to prevent cross-contamination caused by infectious diseases. Despite these benefits, medical personnel face infection risk from patients and a heavy workload due to the isolation. This work proposes a robot-assisted system to alleviate these problems in isolation wards. We conducted a survey on the difficulties medical staff face and the robots they envision. Based on the results, we devised three valuable services using two kinds of heterogeneous robots: telemedicine, emergency alert, and delivery services provided by care robots and delivery robots. Our system also provides user-interactive components such as a dashboard for medical staff and a patient app for inpatients. To manage the services efficiently, we propose a robotic system based on a central control server and a hierarchical management architecture. Through a user study, we assessed the usability of the developed system and discuss its future directions.
|
|
09:12-09:18, Paper WeAT3.8 | Add to My Program |
Irregular Change Detection in Sparse Bi-Temporal Point Clouds Using Learned Place Recognition Descriptors and Point-To-Voxel Comparison |
|
Stathoulopoulos, Nikolaos | Luleå University of Technology, Robotics and AI Group
Koval, Anton | Luleå University of Technology
Nikolakopoulos, George | Luleå University of Technology
Keywords: Field Robots, Mining Robotics, AI-Based Methods
Abstract: Change detection and irregular object extraction in 3D point clouds is a challenging task that is of high importance not only for autonomous navigation but also for updating existing digital twin models of various industrial environments. This article proposes an innovative approach for change detection in 3D point clouds using deep-learned place recognition descriptors and irregular object extraction based on voxel-to-point comparison. The proposed method first aligns the bi-temporal point clouds using a map-merging algorithm in order to establish a common coordinate frame. Then, it utilizes deep learning techniques to extract robust and discriminative features from the 3D point cloud scans, which are used to detect changes between consecutive point cloud frames and therefore find the changed areas. Finally, the altered areas are sampled and compared between the two time instances to extract any obstructions that caused the area to change. The proposed method was successfully evaluated in real-world field experiments, where it was able to detect different types of changes in 3D point clouds, such as object or muck-pile addition and displacement, showcasing the effectiveness of the approach. The results of this study have important implications for various applications, including safety and security monitoring in construction sites and mapping and exploration, and suggest potential future research directions in this field.
|
|
09:18-09:24, Paper WeAT3.9 | Add to My Program |
Magnetically Controlled Cell Robots with Immune-Enhancing Potential |
|
Sun, Hongyan | Beihang University |
Dai, Yuguo | Beihang University |
Zhang, Jiaying | Beihang University, School of Mechanical Engineering &Automation |
Xu, Junjie | Beijing University of Aeronautics and Astronautics |
Lina, Jia | Beihang University
Wang, Chutian | Beihang University |
Wang, Luyao | Beihang University |
Li, Chan | Beihang University |
Bai, Xue | School of Mechanical Engineering & Automation, Beihang Universit |
Chen, Bo | School of Mechanical Engineering & Automation, Beihang Universit |
Feng, Lin | Beihang University |
Keywords: Micro/Nano Robots, Cellular and Modular Robots, Field Robots
Abstract: Magnetic microrobots exhibit enormous potential in targeted drug delivery owing to remote wireless manipulation and minimal invasiveness in medical treatment. A high degree of freedom offers magnetically propelled robots extraordinary application prospects, since they can be controlled precisely when different magnetic field sources work cooperatively. However, the biocompatibility of microrobots has attracted sustained and general concern. Therefore, it is highly necessary to develop a promising carrier with high biocompatibility and to investigate the mechanism of drug loading and release triggered by the special microenvironment of the targeted region. In this paper, we propose magnetically controlled cell robots (MCRs) based on macrophages propelled by a rotating magnetic field. The innovative MCRs exhibit good biocompatibility and low toxicity by optimizing the concentration of polylysine-coated Fe nanoparticles (PLL@FeNPs) to 40 μg/mL. The MCRs were loaded with murine interleukin-12 (IL-12), murine chemokine (C-C motif) ligand 5 (CCL-5), and murine C-X-C motif chemokine ligand 10 (CXCL-10), which stimulate T cell differentiation and the recruitment of monocytes. The macrophages showed an obvious M1-polarization tendency to phagocytose intracellular pathogens and resist the growth of tumor cells. Under the control of a magnetic propelling system composed of three pairs of Helmholtz coils, the cell robots can be propelled wirelessly and moved along a predefined path with high accuracy. Moreover, the MCRs could approach cancer cells and stop at places of interest in vitro. In conclusion, we have accomplished the preliminary construction of a targeted drug delivery system that displays great immune-enhancing potential for targeted drug delivery.
|
|
09:24-09:30, Paper WeAT3.10 | Add to My Program |
Tightly-Coupled Visual-DVL Fusion for Accurate Localization of Underwater Robots |
|
Huang, Yupei | Institute of Automation, Chinese Academy of Sciences |
Li, Peng | Institute of Automation, Chinese Academy of Sciences |
Yan, Shuaizheng | Institute of Automation, Chinese Academy of Sciences |
Ou, Yaming | Chinese Academy of Sciences |
Wu, Zhengxing | Chinese Academy of Sciences |
Tan, Min | Institute of Automation, Chinese Academy of Sciences |
Yu, Junzhi | Chinese Academy of Sciences |
Keywords: Marine Robotics, Field Robots, Localization
Abstract: This paper proposes a tightly-coupled visual-Doppler-velocity-log (visual-DVL) fusion method for underwater robot localization that integrates the velocity measurements from a DVL into a visual odometry (VO). Since employing DVL measurements in dead-reckoning systems easily leads to error accumulation and suboptimal results in previous works, we instead integrate them directly into the visual tracking process. Specifically, the velocity measurements are used to improve the initial estimate of the camera pose during visual tracking, providing a better initial value for pose optimization. The velocity measurements are then also employed directly to constrain the position change of the camera between two adjacent frames by constructing a novel DVL error term, which is optimized jointly with the visual constraints to obtain a more accurate camera pose. Experiments are carried out on datasets collected from several scenarios in the underwater simulation environment HoloOcean, and the results show that the proposed fusion method improves localization accuracy for underwater robots by about 20% compared to pure visual odometry. The proposed method provides valuable guidance for the accurate localization of underwater robots.
|
|
09:30-09:36, Paper WeAT3.11 | Add to My Program |
Fully Proprioceptive Slip-Velocity-Aware State Estimation for Mobile Robots Via Invariant Kalman Filtering and Disturbance Observer |
|
Yu, Xihang | University of Michigan |
Teng, Sangli | University of Michigan, Ann Arbor |
Chakhachiro, Theodor | American University of Beirut |
Tong, Wenzhe | University of Michigan, Ann Arbor |
Li, Tingjun | University of Michigan |
Lin, Tzu-Yuan | University of Michigan |
Koehler, Sarah | University of California, Berkeley |
Ahumada, Manuel | Toyota Research Institute |
Walls, Jeffrey | University of Michigan |
Ghaffari, Maani | University of Michigan |
Keywords: Wheeled Robots, Localization, Field Robots
Abstract: This paper develops a novel slip estimator using invariant observer design theory and a Disturbance Observer (DOB). The proposed state estimator for mobile robots is fully proprioceptive, combining data from an inertial measurement unit and body velocity within a Right Invariant Extended Kalman Filter (RI-EKF). By embedding the slip velocity into the SE3(3) matrix Lie group, the developed DOB-based RI-EKF provides real-time velocity and slip-velocity estimates on different terrains. Experimental results using a Husky wheeled robot confirm the mathematical derivations and the effectiveness of the proposed method in estimating the observable state variables. Open-source software is available for download and for reproducing the presented results.
|
|
09:36-09:42, Paper WeAT3.12 | Add to My Program |
Predicting Energy Consumption and Traversal Time of Ground Robots for Outdoor Navigation on Multiple Types of Terrain |
|
Eder, Matthias | Graz University of Technology |
Steinbauer-Wagner, Gerald | Graz University of Technology |
Keywords: Field Robots, Autonomous Vehicle Navigation, Energy and Environment-Aware Automation
Abstract: The outdoor navigation capabilities of ground robots have improved significantly in recent years, opening up new potential applications in a variety of settings. Cost-based representations of the environment are frequently used in the path planning domain to obtain an optimized path based on various objectives, such as traversal time or energy consumption. However, obtaining such cost representations is still cumbersome, particularly in outdoor settings with diverse terrain types and slope angles. In this paper, we address this problem with a data-driven approach that develops a cost representation for various outdoor terrain types supporting two optimization objectives, namely energy consumption and traversal time. We train a supervised machine learning model whose inputs consist of extracted environment data along a path and whose outputs are the predicted energy consumption and traversal time. The model is based on a ResNet neural network architecture and is trained on field-recorded data. The error of the proposed method on different types of terrain is within 11% of the ground truth. A comparison to a baseline method shows that the proposed approach performs and generalizes better than currently existing approaches on various types of terrain.
|
|
09:42-09:48, Paper WeAT3.13 | Add to My Program |
Informative Path Planning for Scalar Dynamic Reconstruction Using Coregionalized Gaussian Processes and a Spatiotemporal Kernel |
|
Booth, Lorenzo A. | University of California Merced |
Carpin, Stefano | University of California, Merced |
Keywords: Agricultural Automation, Robotics and Automation in Agriculture and Forestry, Planning, Scheduling and Coordination
Abstract: The proliferation of unmanned vehicles offers many opportunities for solving environmental sampling tasks, with applications in resource monitoring and precision agriculture. Informative path planning (IPP) comprises a family of methods that improve on traditional surveying techniques for suggesting observation locations. In this work, we present a novel solution to the IPP problem that uses a coregionalized Gaussian process to estimate a dynamic scalar field varying in space and time. Our method improves on previous approaches by using a composite kernel that accounts for spatiotemporal correlations and, at the same time, can be readily incorporated into existing IPP algorithms. Through extensive simulations, we show that our modeling approach leads to more accurate estimates than formerly proposed methods that do not account for the temporal dimension.
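As a rough illustration of the spatiotemporal-kernel idea, the sketch below multiplies a spatial RBF with a temporal RBF and plugs the result into standard GP regression. This is an assumption-laden toy, not the authors' implementation: the coregionalization across multiple outputs is omitted, and the function names and length-scales are illustrative.

```python
import numpy as np

def rbf(d2, length_scale):
    """Squared-exponential kernel evaluated on squared distances."""
    return np.exp(-0.5 * d2 / length_scale**2)

def spatiotemporal_kernel(X1, T1, X2, T2, ls_space=1.0, ls_time=1.0):
    """Separable spatiotemporal Gram matrix: spatial RBF times temporal RBF."""
    d2_space = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    d2_time = (T1[:, None] - T2[None, :]) ** 2
    return rbf(d2_space, ls_space) * rbf(d2_time, ls_time)

def gp_posterior_mean(X, T, y, Xs, Ts, noise=1e-3, **kw):
    """Standard GP regression mean at query points (Xs, Ts)."""
    K = spatiotemporal_kernel(X, T, X, T, **kw) + noise * np.eye(len(y))
    Ks = spatiotemporal_kernel(Xs, Ts, X, T, **kw)
    return Ks @ np.linalg.solve(K, y)
```

Because the kernel factorizes, observations taken at the same location but different times (or vice versa) still inform one another, which is what lets the planner exploit temporal correlation.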
|
|
WeAT4 Regular session, 140D |
Add to My Program |
Optimization and Optimal Control I |
|
|
Chair: Ghaffari, Azad | Wayne State University |
Co-Chair: Han, Weiqiao | Massachusetts Institute of Technology |
|
08:30-08:36, Paper WeAT4.1 | Add to My Program |
Probabilistic Guarantees for Nonlinear Safety-Critical Optimal Control |
|
Akella, Prithvi | California Institute of Technology |
Ubellacker, Wyatt | California Institute of Technology |
Ames, Aaron | Caltech |
Keywords: Probability and Statistical Methods, Optimization and Optimal Control, Robot Safety
Abstract: Leveraging recent developments in black-box risk-aware verification, we provide three algorithms that generate probabilistic guarantees on (1) optimality of solutions, (2) recursive feasibility, and (3) maximum controller runtimes for general nonlinear safety-critical finite-time optimal controllers. These methods forego the potentially restrictive assumptions required for typical theoretical guarantees, e.g., terminal set calculation for recursive feasibility in Nonlinear Model Predictive Control, or convexification of optimal controllers to ensure optimality. Furthermore, we show that these methods can be applied directly on hardware to generate controller guarantees for the respective systems.
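The sample-complexity flavor of such black-box guarantees can be illustrated with the standard scenario-style bound (a hedged sketch, not the paper's risk-aware formulation): to conclude that a property holds with probability at least 1 − ε at confidence 1 − δ, it suffices to observe it on every one of N i.i.d. trials with N ≥ log δ / log(1 − ε).

```python
import math

def required_samples(eps, delta):
    """Smallest N with (1 - eps)**N <= delta: if a property held on all N
    i.i.d. trials, it holds with probability >= 1 - eps at confidence
    1 - delta (classic scenario-style bound)."""
    return math.ceil(math.log(delta) / math.log(1.0 - eps))
```

For example, 59 successful trials certify a 95%/95% guarantee, and 299 trials certify 99% probability at 95% confidence.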
|
|
08:36-08:42, Paper WeAT4.2 | Add to My Program |
Learning from Human Directional Corrections (I) |
|
Jin, Wanxin | Arizona State University |
Murphey, Todd | Northwestern University |
Lu, Zehui | Purdue University |
Mou, Shaoshuai | Purdue University |
Keywords: Optimization and Optimal Control, Physical Human-Robot Interaction, Motion and Path Planning, Human Factors and Human-in-the-Loop
Abstract: This paper proposes a novel approach that enables a robot to learn an objective function incrementally from human directional corrections. Existing methods learn from human magnitude corrections; since a human needs to carefully choose the magnitude of each correction, those methods can easily lead to over-correction and learning inefficiency. The proposed method only requires human directional corrections --- corrections that only indicate the direction of an input change without indicating its magnitude. We only assume that each correction, regardless of its magnitude, points in a direction that improves the robot's current motion relative to an implicit objective function. For each directional correction, the proposed method updates the estimate of the objective function based on a cutting-plane method, which has a geometric interpretation. We establish theoretical results showing the convergence of the learning process. The proposed method has been tested in numerical examples, a user study on two human-robot games, and a real-world quadrotor experiment.
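A toy sketch of the cutting-plane geometry (hypothetical names and setup, not the paper's implementation): each directional correction d is assumed to improve the motion with respect to an implicit objective with unknown unit parameter theta, which yields the halfspace cut d · theta ≥ 0; the estimate here is the normalized centroid of unit-norm parameters consistent with all cuts so far, approximated by sampling.

```python
import numpy as np

def update_estimate(cuts, dim=2, n_samples=20000, seed=0):
    """Centroid of unit vectors satisfying every halfspace cut d . theta >= 0."""
    rng = np.random.default_rng(seed)
    thetas = rng.normal(size=(n_samples, dim))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    ok = np.ones(n_samples, dtype=bool)
    for d in cuts:                       # keep samples satisfying every cut
        ok &= thetas @ d >= 0.0
    center = thetas[ok].mean(axis=0)
    return center / np.linalg.norm(center)

true_theta = np.array([0.6, 0.8])        # ground truth, unknown to the learner
rng = np.random.default_rng(1)
cuts = []
for _ in range(50):                      # simulated corrections: any direction
    d = rng.normal(size=2)               # with positive projection onto the
    if d @ true_theta < 0:               # true parameter is a valid cut
        d = -d
    cuts.append(d)
estimate = update_estimate(cuts)
```

Each cut halves the remaining feasible cone regardless of the correction's magnitude, which is the geometric reason magnitude information is unnecessary.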
|
|
08:42-08:48, Paper WeAT4.3 | Add to My Program |
Non-Gaussian Uncertainty Minimization Based Control of Stochastic Nonlinear Robotic Systems |
|
Han, Weiqiao | Massachusetts Institute of Technology |
M. Jasour, Ashkan | MIT |
Williams, Brian | MIT |
Keywords: Optimization and Optimal Control, Motion and Path Planning, Probability and Statistical Methods
Abstract: In this paper, we consider the closed-loop control problem of nonlinear robotic systems in the presence of probabilistic uncertainties and disturbances. More precisely, we design a state feedback controller that minimizes deviations of the states of the system from the nominal state trajectories due to uncertainties and disturbances. Existing approaches to the control problem of probabilistic systems are limited to particular classes of uncertainties and systems, such as Gaussian uncertainties and processes, and linearized systems. We present an approach that deals with nonlinear dynamics models and arbitrary known probabilistic uncertainties. We formulate the controller design problem as an optimization problem in terms of statistics of the probability distributions, including moments and characteristic functions. In particular, in the provided optimization problem, we use moments and characteristic functions to propagate uncertainties throughout the nonlinear motion model of robotic systems. To reduce the tracking deviations, we minimize the uncertainty of the probabilistic states around the nominal trajectory by minimizing the trace and the determinant of the covariance matrix of the probabilistic states. To obtain the state feedback gains, we solve deterministic optimization problems in terms of moments, characteristic functions, and state feedback gains using off-the-shelf interior-point solvers. To illustrate the performance of the proposed method, we compare it with existing probabilistic control methods.
|
|
08:48-08:54, Paper WeAT4.4 | Add to My Program |
Learning Compliant Stiffness by Impedance Control Aware Task Segmentation and Multi-Objective Bayesian Optimization with Priors |
|
Okada, Masashi | Panasonic Holdings Corporation |
Komatsu, Mayumi | Panasonic Corp |
Okumura, Ryo | Panasonic Holdings Corporation |
Taniguchi, Tadahiro | Ritsumeikan University |
Keywords: Optimization and Optimal Control, Compliance and Impedance Control, Learning from Demonstration
Abstract: Rather than traditional position control, impedance control is preferred to ensure the safe operation of industrial robots programmed from demonstrations. However, variable-stiffness learning studies have focused on task performance rather than safety (or compliance). This paper therefore proposes a novel stiffness learning method that satisfies both task performance and compliance requirements. The proposed method optimizes the task and compliance objectives (T/C objectives) simultaneously via multi-objective Bayesian optimization. We define the stiffness search space by segmenting a demonstration into task phases, each governed by a constant stiffness. The segmentation is performed by identifying impedance-control-aware switching linear dynamics (IC-SLD) from the demonstration. We also use the stiffness obtained by the proposed IC-SLD as a prior for efficient optimization. Experiments on simulated tasks and a real robot demonstrate that IC-SLD-based segmentation and the use of priors improve optimization efficiency compared to existing baseline methods.
|
|
08:54-09:00, Paper WeAT4.5 | Add to My Program |
Efficient Exploration Using Extra Safety Budget in Constrained Policy Optimization |
|
Xu, Haotian | Tsinghua University |
Wang, Shengjie | Tsinghua University |
Wang, Zhaolei | Beijing Aerospace Automatic Control Institute |
Zhang, Yunzhe | Tsinghua University |
Zhuo, Qing | Tsinghua University |
Gao, Yang | Tsinghua University |
Zhang, Tao | Tsinghua University |
Keywords: Optimization and Optimal Control, Reinforcement Learning, Motion and Path Planning
Abstract: Reinforcement learning (RL) has achieved promising results on most robotic control tasks. Safety of learning-based controllers is essential to ensuring their effectiveness. Current methods enforce the full constraints throughout training, resulting in inefficient exploration in the early stage. In this paper, we propose an algorithm named Constrained Policy Optimization with Extra Safety Budget (ESB-CPO) to strike a balance between exploration efficiency and constraint satisfaction. In the early stage, our method loosens the practical constraints on unsafe transitions (adding an extra safety budget) with the aid of a new metric we propose. As training progresses, the constraints in our optimization problem become tighter, and both theoretical analysis and practical experiments demonstrate that our method gradually meets the cost limit in the final training stage. When evaluated on the Safety-Gym and Bullet-Safety-Gym benchmarks, our method shows advantages over baseline algorithms in terms of both safety and optimality. Notably, our method achieves remarkable performance improvements under the same cost limit compared with the baselines.
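The annealing idea can be sketched as follows (the linear schedule and names are illustrative assumptions; the paper derives the extra budget from its own metric): training starts with a loosened effective cost limit that decays back to the true limit.

```python
def effective_cost_limit(step, total_steps, cost_limit, extra_budget_0):
    """Linearly anneal an extra safety budget from extra_budget_0 down to 0,
    so the effective constraint tightens to cost_limit by the end of training.
    (Illustrative schedule, not the paper's exact metric-driven budget.)"""
    frac = min(step / total_steps, 1.0)
    return cost_limit + extra_budget_0 * (1.0 - frac)
```

Early on the agent may pay up to cost_limit + extra_budget_0, enabling broader exploration; by the final stage only the true cost limit remains.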
|
|
09:00-09:06, Paper WeAT4.6 | Add to My Program |
Single-Level Differentiable Contact Simulation |
|
Le Cleac'h, Simon | Stanford University |
Schwager, Mac | Stanford University |
Manchester, Zachary | Carnegie Mellon University |
Sindhwani, Vikas | Google Brain, NYC |
Florence, Peter | MIT |
Singh, Sumeet | Google |
Keywords: Optimization and Optimal Control, Simulation and Animation, Dynamics
Abstract: We present a differentiable formulation of rigid-body contact dynamics for objects and robots represented as compositions of convex primitives. Existing optimization-based approaches simulating contact between convex primitives rely on a bilevel formulation that separates collision detection and contact simulation. These approaches are unreliable in realistic contact simulation scenarios because isolating the collision detection problem introduces contact location non-uniqueness. Our approach combines contact simulation and collision detection into a unified single-level optimization problem. This disambiguates the collision detection problem in a physics-informed manner. Our formulation features improved simulation robustness and a reduction in computational complexity when compared to a similar differentiable simulation baseline. We illustrate the contact and collision differentiability on a robotic manipulation task requiring optimization-through-contact. We provide a numerically efficient implementation of our formulation in the Julia language called Silico.jl.
|
|
09:06-09:12, Paper WeAT4.7 | Add to My Program |
Quadratic Dynamic Matrix Control for Fast Cloth Manipulation |
|
Caldarelli, Edoardo | Institut de Robòtica i Informàtica Industrial (CSIC-UPC) |
Colomé, Adrià | Institut de Robòtica i Informàtica Industrial (CSIC-UPC), Q28180 |
Ocampo-Martinez, Carlos | Universitat Politècnica de Catalunya - BarcelonaTECH (UPC) |
Torras, Carme | CSIC - UPC |
Keywords: Optimization and Optimal Control, Motion Control, Probability and Statistical Methods
Abstract: Robotic cloth manipulation is an increasingly relevant area of research, challenging classic control algorithms due to the deformable nature of cloth. While it is possible to apply linear model predictive control to make the robot move the cloth according to a given reference, this approach suffers from a large dimensionality of the state-space representation of the cloth models. To address this issue, in this work we study the application of an input-output model predictive control strategy, based on quadratic dynamic matrix control, to robotic cloth manipulation. To account for uncertain disturbances on the cloth's motion, we further extend the algorithm with suitable chance constraints. In extensive simulated experiments, involving disturbances and obstacle avoidance, we show that quadratic dynamic matrix control can be successfully applied in different cloth manipulation scenarios, with significant gains in optimization speed compared to standard model predictive control strategies. The experiments further demonstrate that the closed-loop model used by quadratic dynamic matrix control can be beneficial to the tracking accuracy, leading to improvements over the standard predictive control strategy. Moreover, a preliminary experiment on a real robot shows that quadratic dynamic matrix control can indeed be employed in real settings.
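A minimal sketch of the unconstrained dynamic-matrix-control core that QDMC builds on (horizons and the step-response model below are illustrative assumptions, not the paper's cloth model): the dynamic matrix A maps future input moves to predicted outputs, and the receding-horizon first move is Δu = K · (reference − free response).

```python
import numpy as np

def dmc_gain(step_response, pred_horizon, ctrl_horizon, weight=0.0):
    """First-move gain of dynamic matrix control built from a truncated
    step-response model: column j of A is the step response shifted by j."""
    A = np.zeros((pred_horizon, ctrl_horizon))
    for i in range(pred_horizon):
        for j in range(ctrl_horizon):
            if i - j >= 0:
                A[i, j] = step_response[i - j]
    # Least-squares solution of the tracking problem, optionally regularized
    K = np.linalg.solve(A.T @ A + weight * np.eye(ctrl_horizon), A.T)
    return K[0]                          # receding horizon: apply first move only
```

Because the model is an input-output step response rather than a state-space realization, the controller's dimensionality is set by the horizons, not by the (large) cloth state, which is the computational advantage the abstract refers to.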
|
|
09:12-09:18, Paper WeAT4.8 | Add to My Program |
A Gaussian Process Model for Opponent Prediction in Autonomous Racing |
|
Zhu, Edward | University of California, Berkeley |
Busch, Finn Lukas | Hamburg University of Technology |
Johnson, Jake | University of California, Berkeley |
Borrelli, Francesco | University of California, Berkeley |
Keywords: Probabilistic Inference, Optimization and Optimal Control, Machine Learning for Robot Control
Abstract: In head-to-head racing, performing tightly constrained but highly rewarding maneuvers, such as overtaking, requires an accurate model of the interactive behavior of the opposing target vehicle (TV). We propose to construct a prediction model from data of the TV collected in previous races. In particular, a one-step Gaussian process (GP) model is trained on closed-loop interaction data to learn the behavior of a TV driven by an unknown policy. Predictions of the nominal trajectory and the associated uncertainty are rolled out via a sampling-based approach and used in a model predictive control (MPC) policy for the ego vehicle to intelligently trade off safety and performance when racing against a TV. In a Monte Carlo study, we compare the GP-based predictor in closed loop with the MPC policy against several predictors from the literature and observe that the GP-based predictor achieves similar win rates while maintaining safety in up to 3x more races. Through experiments, we demonstrate the approach in real time on a 1/10th-scale racecar platform operating at speeds of around 2.8 m/s, and show a significant improvement when using the GP-based predictor over a baseline MPC predictor. Videos of the experiments can be found at https://youtu.be/KMSs4ofDfIs.
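The sampling-based rollout of a one-step probabilistic predictor can be sketched as below; `one_step_mean` and `one_step_std` are hypothetical stand-ins for the trained GP's predictive mean and standard deviation (the paper learns these from closed-loop interaction data).

```python
import numpy as np

def rollout_samples(x0, one_step_mean, one_step_std, horizon, n_samples, seed=0):
    """Propagate n_samples trajectories by repeatedly sampling the one-step
    model: x_{k+1} = mean(x_k) + std(x_k) * noise."""
    rng = np.random.default_rng(seed)
    X = np.tile(np.asarray(x0, float), (n_samples, 1))
    traj = [X.copy()]
    for _ in range(horizon):
        X = one_step_mean(X) + one_step_std(X) * rng.normal(size=X.shape)
        traj.append(X.copy())
    return np.stack(traj)                # shape: (horizon + 1, n_samples, dim)
```

The spread of the sampled trajectories at each step gives the uncertainty the MPC policy can trade off against performance.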
|
|
09:18-09:24, Paper WeAT4.9 | Add to My Program |
Optimal Energy Tank Initialization for Minimum Sensitivity to Model Uncertainties |
|
Pupa, Andrea | University of Modena and Reggio Emilia |
Robuffo Giordano, Paolo | IRISA CNRS UMR6074 |
Secchi, Cristian | Univ. of Modena & Reggio Emilia |
Keywords: Robust/Adaptive Control, Optimization and Optimal Control, Motion Control
Abstract: Energy tanks have gained popularity in the robotics and control communities in recent years, since they are a formidable tool for enforcing passivity (and thus input/output stability) of a controlled robot, possibly interacting with uncertain environments. One weak point of passification strategies based on energy tanks, however, is their initialization. Indeed, a too-large initial energy can cause practically unstable behaviors, while a too-low initial energy level can prevent the correct execution of the task. This shortcoming becomes even more relevant in the presence of uncertainties in the robot model and/or environment, since it may be hard to predict in advance the correct (safe) amount of initial tank energy for a successful task execution. In this paper we propose a new strategy for addressing this issue. The recent notion of closed-loop state sensitivity is exploited to derive precise bounds (tubes) on the tank energy behavior under parametric uncertainty in the robot model. These tubes are then exploited in a novel nonlinear optimization problem that finds both the best trajectory and the minimal initial tank energy allowing a positioning task to be executed for any value of the uncertain parameters in a given range. The approach is validated via a statistical analysis in simulation and via experiments on real robot hardware.
|
|
09:24-09:30, Paper WeAT4.10 | Add to My Program |
Auxiliary Control to Avoid Undesirable Equilibria in Constrained Quadratic Programs for Trajectory Tracking Applications (I) |
|
Desai, Manavendra | Wayne State University |
Ghaffari, Azad | Wayne State University |
Keywords: Optimization and Optimal Control, Motion Control, Collision Avoidance, Wheeled Robots
Abstract: Control Lyapunov function (CLF) and control barrier function (CBF) based quadratic programs (QPs) may create undesirable local equilibria in control systems. One recent solution utilizes a rotating non-radial CLF to avoid such equilibria in regulation applications. To track trajectories, a nominal control can be incorporated into the QP to improve tracking performance of the CLF-CBF-QP. However, the direction of the steepest descent curve of the CLF can differ from that of the nominal control, which may compromise the tracking performance. Moreover, the design of a CLF is system-specific and generally not easy to realize. This work proposes a Tracking-CBF-QP formulation, where a nominal tracking control is incorporated in the cost function of a CBF-based QP. If the nominal control conflicts with the CBF condition, undesirable local equilibria may be induced in the closed-loop system. This work theoretically investigates the occurrence of such undesirable local equilibria and provides an auxiliary control approach to avoid such equilibria. The proposed Tracking-CBF-QP is used to design a novel leader-follower algorithm, and its effectiveness is verified experimentally.
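The basic safety filter underlying CBF-based QPs can be sketched in closed form: minimize ‖u − u_nom‖² subject to a single CBF condition a · u ≥ b, which is a projection onto a halfspace. This is an illustration of the generic building block, not the paper's Tracking-CBF-QP with auxiliary control.

```python
import numpy as np

def cbf_filter(u_nom, a, b):
    """Closed-form solution of min ||u - u_nom||^2 s.t. a . u >= b:
    project the nominal control onto the halfspace {u : a . u >= b}."""
    a = np.asarray(a, float)
    u_nom = np.asarray(u_nom, float)
    slack = b - a @ u_nom
    if slack <= 0:                       # nominal control already safe
        return u_nom
    return u_nom + (slack / (a @ a)) * a
```

When the constraint is active, the filtered control sits exactly on the constraint boundary; the undesirable equilibria studied in the paper arise when this projected control and the nominal tracking control cancel each other.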
|
|
09:30-09:36, Paper WeAT4.11 | Add to My Program |
A Data-Driven Approach to Synthesizing Dynamics-Aware Trajectories for Underactuated Robotic Systems |
|
Srikanthan, Anusha | University of Pennsylvania |
Yang, Fengjun | University of Pennsylvania |
Spasojevic, Igor | University of Pennsylvania |
Thakur, Dinesh | University of Pennsylvania |
Kumar, Vijay | University of Pennsylvania |
Matni, Nikolai | University of Pennsylvania |
Keywords: Optimization and Optimal Control, Dynamics, Underactuated Robots
Abstract: We consider joint trajectory generation and tracking control for underactuated robotic systems. A common solution is a layered control architecture, where the top layer uses a simplified model of system dynamics for trajectory generation, and the lower layer ensures approximate tracking of this trajectory via feedback control. While such layered architectures are standard and work well in practice, selecting the simplified model used for trajectory generation typically relies on engineering intuition and experience. In this paper, we propose an alternative data-driven approach to dynamics-aware trajectory generation. We show that a suitable augmented Lagrangian reformulation of a global nonlinear optimal control problem results in a layered decomposition of the overall problem into trajectory planning and feedback control layers. Crucially, the resulting trajectory optimization is dynamics-aware in that it is modified with a tracking-penalty regularizer encoding the dynamic feasibility of the generated trajectory. We show that this regularizer can be learned from system rollouts for independently designed low-layer feedback control policies, and we instantiate our framework on a unicycle and a quadrotor control problem in simulation. Further, we show that our approach handles the sim-to-real gap through experiments on the quadrotor hardware platform without any additional training. For both the synthetic unicycle example and the quadrotor system, our framework shows significant improvements in computation time and dynamic feasibility in both simulation and hardware experiments.
|
|
09:36-09:42, Paper WeAT4.12 | Add to My Program |
Time-Optimal Control Via Heaviside Step-Function Approximation |
|
Pfeiffer, Kai | School of Mechanical and Aerospace Engineering, Nanyang Technological University |
Pham, Quang-Cuong | NTU Singapore |
Keywords: Optimization and Optimal Control, Motion and Path Planning, Motion Control
Abstract: Least-squares programming is a popular tool in robotics due to its simplicity and the availability of open-source solvers. However, certain problems, like sparse programming in the L0- or L1-norm for time-optimal control, cannot be solved equivalently in this form. In this work, we propose a non-linear hierarchical least-squares programming (NL-HLSP) approach for time-optimal control of non-linear discrete dynamic systems. We use a continuous approximation of the Heaviside step function with an additional term that avoids vanishing gradients. We use a simple discretization method that keeps states and controls piece-wise constant between discretization steps. This way, we obtain a comparatively easy-to-implement NL-HLSP, in contrast to direct transcription approaches to optimal control. We show that the NL-HLSP indeed recovers the discrete time-optimal control in the limit for resting goal points. We confirm the results in simulation for linear and non-linear control scenarios.
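One common realization of such an approximation (the constants k and c below are illustrative assumptions; the paper's exact form may differ) is a sigmoid plus a small linear term whose slope keeps the gradient bounded away from zero far from the switch:

```python
import math

def smooth_step(x, k=10.0, c=1e-3):
    """Smooth Heaviside approximation: sigmoid(k*x) plus a small linear term."""
    return 1.0 / (1.0 + math.exp(-k * x)) + c * x

def smooth_step_grad(x, k=10.0, c=1e-3):
    """Gradient k*s*(1-s) + c, which is >= c everywhere: the sigmoid part
    vanishes for large |x|, but the linear term keeps the gradient alive."""
    s = 1.0 / (1.0 + math.exp(-k * x))
    return k * s * (1.0 - s) + c
```

Without the +c term, the gradient decays exponentially away from x = 0 and a gradient-based solver would stall on the flat plateaus.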
|
|
WeAT5 Regular session, 140E |
Add to My Program |
Manufacturing and Logistics I |
|
|
Chair: Elara, Mohan Rajesh | Singapore University of Technology and Design |
Co-Chair: Al Janaideh, Mohammad | University of Guelph |
|
08:30-08:36, Paper WeAT5.1 | Add to My Program |
Magnetically-Assisted Microfluidic Printing for the Fabrication of Anisotropic Skeletal Muscle Structure |
|
Wei, Zihou | Beijing Institute of Technology |
Yu, Xiao | Beijing Institute of Technology |
Chen, Shuibin | Beijing Institute of Technology |
Cong, Rong | Beijing Institute of Technology |
Wang, Huaping | Beijing Institute of Technology |
Shi, Qing | Beijing Institute of Technology |
Fukuda, Toshio | Nagoya University |
Sun, Tao | Beijing Institute of Technology |
Keywords: Additive Manufacturing, Biological Cell Manipulation, Biomimetics
Abstract: Microfluidic printing provides a novel tool to facilitate the bulk assembly of cell-aligned microfibers for the fabrication of artificial skeletal muscle structures. However, anisotropic assembly is still difficult to achieve due to poor control of the microfiber assembly position. In this paper, we developed a magnetically assisted microfluidic printing system to solve this problem. Magnetic nanoparticles (MNPs) were encapsulated in the microfluidically spun microfibers, and a spiral magnet was designed as the microfiber deposition substrate. A robotic manipulation system based on visual servoing drives the relative movement between the microfluidic spinning orifice and the axis of the spiral magnet, so the microfiber can be deposited continuously, layer by layer, to form an anisotropic assembly structure. Our results demonstrate that magnetic attraction is a novel microfiber-specific micromanipulation strategy for stable deposition in a solution environment. Furthermore, preliminary cell experiments show that our method has high biocompatibility and can be further used in the fabrication of anisotropic skeletal muscle structures.
|
|
08:36-08:42, Paper WeAT5.2 | Add to My Program |
Bi-Component Silicone 3D Printing with Dynamic Mix Ratio Modification for Soft Robotic Actuators |
|
Parilusyan, Brice | Léonard De Vinci Pôle Universitaire, Research Center |
Nicolae, Alina-Elena | CY Tech - CY Cergy Paris Université |
Batigne, Thomas | Lynxter |
Duhart, Clément | Léonard De Vinci Pôle Universitaire, Research Center, 92 916 Par |
Serrano, Marcos | IRIT - University of Toulouse |
Keywords: Additive Manufacturing, Soft Sensors and Actuators, Flexible Robotics
Abstract: Pneumatically operated soft actuators are increasingly researched due to their fabrication simplicity, actuation capabilities, and low production cost. Depending on the objective of a Soft Pneumatic Actuator (SPA), its design can be modified to reach new bending angles or to increase its actuation strength. However, increasing the abilities of SPAs requires more complex air cavities or multiple materials with different mechanical stiffnesses. Both solutions complicate the fabrication of SPAs, reducing their primary benefits of manufacturing simplicity and low production cost. This paper presents a novel additive manufacturing process that incorporates multiple mechanical stiffnesses using a single bi-component soft material, aiming to integrate multiple bending angles into multi-channel SPAs without increasing their manufacturing complexity. Our process dynamically modifies the bi-component silicone mix ratio to generate the desired mechanical properties of the material; modifying the mix ratio allows us to control the material's cure time and mechanical properties, such as its final stiffness. We found that, using a single 30 shore-A bi-component silicone, we could achieve several stiffness values with different reticulation times and levels of stickiness. Using these shore ranges and our fabrication process, we built several SPAs and explored how their printing orientation modifies their bending actuation, illustrating the capabilities of our approach.
|
|
08:42-08:48, Paper WeAT5.3 | Add to My Program |
Two-Stage Train Components Defect Detection Based on Prior Knowledge |
|
Peng, Gang | Huazhong University of Science and Technology |
Li, Zhiyong | Huazhong University of Science and Technology |
Wan, Shaowei | Huawei Technologies |
Deng, Zhang | Wuhan Lisai Technology Co., Ltd |
Keywords: Industrial Robots, Engineering for Robotic Systems, Computer Vision for Automation
Abstract: The existing approach to detecting defects in train components relies on visual identification, requires extensive involvement from inspectors, and presents certain limitations. In this study, a two-stage defect detection method based on prior knowledge was developed: it first detects the types and positions of components, and then conducts targeted detection of the defect types that may be present. The algorithm introduces prior knowledge of the relative spatial positions of components and optimizes the detection of sub-components through cascaded convolutional neural networks and local scale-up. Three techniques were used (deep learning, template matching, and quantitative evaluation based on prior knowledge) to perform targeted detection of the defect types that may occur in components. Experiments have verified the adaptability and accuracy of the method, demonstrating its high value for engineering applications.
|
|
08:48-08:54, Paper WeAT5.4 | Add to My Program |
Complete Coverage Path Planning for Omnidirectional Expand and Collapse Robot Panthera |
|
Lim, Yi | Singapore University of Technology and Design |
Wan, Ash Yaw Sang | Singapore University of Technology and Design |
Hayat, Abdullah Aamir | Singapore University of Technology and Design |
Qinrui, Tang | SUTD |
Le, Anh Vu | Communication and Signal Processing Research Group Faculty of El |
Elara, Mohan Rajesh | Singapore University of Technology and Design |
Keywords: Robotics and Automation in Construction, Service Robotics
Abstract: Autonomous mobile robots (AMRs) face challenges in efficiently covering complex environments. To navigate narrow and expansive areas, AMRs must have two essential attributes: compact size for confined spaces and larger size with omnidirectional locomotion for broader spaces. This study utilizes Omnidirectional expand and collapse self-Reconfigurable Robots (OECRs) to demonstrate efficient area coverage. OECRs can compress to navigate through confined spaces and expand for efficient coverage in broad spaces. However, current complete coverage path planning (CCPP) methods do not account for the expanded and compressed states of OECRs. To address this, a depth-first search (DFS) approach is proposed for OECRs' CCPP, which can adjust the robotic footprint along the CCPP path to reduce path length. The proposed DFS outperforms the state-of-the-art CCPP in terms of increased area coverage and reduced distance traveled on a selected map.
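The DFS idea behind the proposed CCPP can be illustrated with a minimal grid-coverage sketch. This is a hypothetical illustration only; the paper's planner additionally switches the robot footprint between expanded and collapsed states along the path:

```python
# Minimal sketch of DFS-based complete coverage on a grid map.
# Hypothetical illustration; does not model the expand/collapse footprint.

def dfs_coverage(grid, start):
    """Return a cell-visit order covering all free cells reachable from start.

    grid: 2D list, 0 = free, 1 = obstacle; start: (row, col).
    """
    rows, cols = len(grid), len(grid[0])
    visited, path, stack = set(), [], [start]
    while stack:
        cell = stack.pop()
        if cell in visited:
            continue
        visited.add(cell)
        path.append(cell)
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                stack.append((nr, nc))
    return path

grid = [[0, 0, 1],
        [0, 0, 0],
        [1, 0, 0]]
order = dfs_coverage(grid, (0, 0))
assert len(order) == 7        # all 7 free cells covered, each exactly once
```

In the paper's setting, each visited cell would additionally carry a footprint state so that the robot collapses only where the free corridor is narrower than its expanded width.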
|
|
08:54-09:00, Paper WeAT5.5 | Add to My Program |
Monte-Carlo Tree Search with Prioritized Node Expansion for Multi-Goal Task Planning |
|
Pfeiffer, Kai | School of Mechanical and Aerospace Engineering, Nanyang Technolo |
Edgar, Leonardo | Nanyang Technological University, Singapore |
Pham, Quang-Cuong | NTU Singapore |
Keywords: Factory Automation, Motion and Path Planning, Task Planning
Abstract: Symbolic task planning for robots is computationally challenging due to the combinatorial complexity of the possible action space. This fact is amplified if there are several sub-goals to be achieved due to the increased length of the action sequences. In this work, we propose a multi-goal symbolic task planner for deterministic decision processes based on Monte Carlo Tree Search. We augment the algorithm by prioritized node expansion which prioritizes nodes that already have fulfilled some sub-goals. Due to its linear complexity in the number of sub-goals, our algorithm is able to identify symbolic action sequences of 145 elements to reach the desired goal state with up to 48 sub-goals while the search tree is limited to under 6500 nodes. We use action reduction based on a kinematic reachability criterion to further ease computational complexity. We combine our algorithm with object localization and motion planning and apply it to a real-robot demonstration with two manipulators in an industrial bearing inspection setting.
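The prioritized-expansion heuristic, preferring nodes whose states already satisfy more sub-goals, can be sketched as follows. This is a minimal illustration with a hypothetical state representation, not the authors' planner:

```python
# Sketch of prioritized node expansion: among candidate frontier nodes,
# prefer the one whose state already satisfies the most sub-goals.
# Hypothetical illustration; states are modelled as sets of achieved facts.

import heapq

def prioritized_expand(frontier, subgoals):
    """Pop the frontier state satisfying the most sub-goals.

    frontier: list of states (frozensets of achieved facts);
    subgoals: iterable of facts that must all eventually hold.
    """
    # Negate the count so heapq's min-heap pops the best node first;
    # the index i breaks ties without comparing frozensets.
    heap = [(-sum(g in state for g in subgoals), i, state)
            for i, state in enumerate(frontier)]
    heapq.heapify(heap)
    _, _, best = heapq.heappop(heap)
    return best

frontier = [frozenset({"a"}), frozenset({"a", "b"}), frozenset()]
best = prioritized_expand(frontier, ["a", "b", "c"])
assert best == frozenset({"a", "b"})
```

Because the priority is just a count over sub-goals, the bookkeeping grows linearly in the number of sub-goals, consistent with the linear complexity claimed in the abstract.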
|
|
09:00-09:06, Paper WeAT5.6 | Add to My Program |
Efficient and Feasible Robotic Assembly Sequence Planning Via Graph Representation Learning |
|
Atad, Matan | Technical University of Munich |
Feng, Jianxiang | Institute of Robotics and Mechatronics, German Aerospace Center |
Rodriguez, Ismael | German Aerospace Center (DLR) |
Durner, Maximilian | German Aerospace Center DLR |
Triebel, Rudolph | German Aerospace Center (DLR) |
Keywords: Assembly, Deep Learning Methods, Intelligent and Flexible Manufacturing
Abstract: Automatic Robotic Assembly Sequence Planning (RASP) can significantly improve productivity and resilience in modern manufacturing, which faces a growing need for greater product customization. One of the main challenges in realizing such automation is searching for solutions among a growing number of potential sequences for increasingly complex assemblies while conducting costly feasibility checks for the robotic system effectively and efficiently. To address this, we propose a holistic graphical approach comprising a graph representation, the Assembly Graph, for product assemblies and a policy architecture, the Graph Assembly Processing Network, dubbed GRACE, for assembly sequence generation. With GRACE, we are able to extract meaningful information from the graph input and predict assembly sequences step by step. In experiments, we show that our approach can predict feasible assembly sequences across product variants of aluminum profiles based on data collected in simulation of a dual-armed robotic system. We further demonstrate that our method can detect infeasible assemblies, substantially alleviating the undesirable impacts of false predictions and hence facilitating real-world deployment. Code and training data will be made publicly available.
|
|
09:06-09:12, Paper WeAT5.7 | Add to My Program |
Statistical Characterization of Position-Dependent Behavior Using Frequency-Aware B-Spline |
|
Al-Rawashdeh, Yazan | Memorial University of Newfoundland |
Heertjes, Marcel | Eindhoven University of Technology |
Al Janaideh, Mohammad | University of Guelph |
Keywords: Semiconductor Manufacturing
Abstract: Stretching the definition of the standard sine profile allows building a generalized, symmetric, frequency-aware basis function that can be used to generate reference motion trajectories. Other profiles, such as polynomials, sigmoids, and harmonic-based models, can equally be used under the proposed technique. Although the generic basis function is suitable at the level of any higher-order time derivative, in this study it is realized at the jerk level so that the generated signals adhere to the limitations of the driven motion system. By introducing suitable time shifts, replicas of the basis function can be obtained, giving rise to B-spline-like frequency-aware profiles that can be used to realize the actual motion under any desired kinematic constraints, which are neatly written to reduce the computational burden on the motion controller side. Using mainly the frequency-aware B-spline profiles, a frequency-dependent random-walk motion is presented and used to collect information about the driven motion system, helping to characterize position-dependent errors through statistical means, i.e., Analysis of Variance and Design of Experiments. This allows dividing the workspace in which motion takes place into several spatial regions with preferred frequency contents. The effectiveness of the proposed profiles is shown through hardware experiments on a precision motion system.
|
|
09:12-09:18, Paper WeAT5.8 | Add to My Program |
Flexible Gear Assembly with Visual Servoing and Force Feedback |
|
Ming, Junjie | Technical University of Munich |
Bargmann, Daniel | Fraunhofer IPA |
Cao, Hongpeng | Technical University of Munich |
Caccamo, Marco | Technical University of Munich |
Keywords: Assembly, Visual Servoing, Reinforcement Learning
Abstract: This paper presents a vision-guided two-stage approach with force feedback to achieve high-precision and flexible gear assembly. The proposed approach integrates YOLO to coarsely localize the target workpiece in a searching phase and deep reinforcement learning (DRL) to complete the insertion. Specifically, DRL addresses the challenge of partial visibility when the on-wrist camera is too close to a small workpiece. Moreover, we use force feedback to improve the robustness of the vision-guided assembly process. To reduce the effort of collecting training data on real robots, we use synthetic RGB images to train YOLO and construct an offline interaction environment leveraging sampled real-world data to train the DRL agents. The proposed approach was evaluated in an industrial gear assembly experiment requiring an assembly clearance of 0.3 mm, demonstrating high robustness and efficiency in gear searching and insertion from arbitrary positions.
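The two-stage control flow can be sketched as a search phase handing off to a closed-loop insertion phase guarded by a force threshold. All function names here are hypothetical stand-ins for the YOLO localizer, the DRL policy, and the wrist force sensor, not the authors' interfaces:

```python
# Hedged sketch of a two-stage assembly loop: coarse visual search,
# then a learned insertion policy monitored by a force limit.
# localize/policy/read_force are hypothetical callables.

def two_stage_assembly(localize, policy, read_force, steps=50, f_max=20.0):
    """Run search then insertion; abort if contact force exceeds f_max (N)."""
    pose = localize()               # stage 1: coarse target pose from vision
    for _ in range(steps):          # stage 2: closed-loop insertion
        if read_force() > f_max:    # safety guard from force feedback
            return "aborted"
        pose, done = policy(pose)   # learned policy refines the pose
        if done:
            return "inserted"
    return "timeout"

# Toy stand-ins: descend 1 mm per step from 5 mm above the seat.
state = {"z": 5}
result = two_stage_assembly(
    localize=lambda: state,
    policy=lambda p: ({"z": p["z"] - 1}, p["z"] - 1 <= 0),
    read_force=lambda: 1.0,
)
assert result == "inserted"
```

The force guard is what keeps a mislocalized search result from turning into a jammed, high-force insertion attempt.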
|
|
09:18-09:24, Paper WeAT5.9 | Add to My Program |
Robotic Powder Grinding with Audio-Visual Feedback for Laboratory Automation in Materials Science |
|
Nakajima, Yusaku | SOKENDAI |
Hamaya, Masashi | OMRON SINIC X Corporation |
Tanaka, Kazutoshi | OMRON SINIC X Corporation |
Hawai, Takafumi | Osaka University |
von Drigalski, Felix Wolf Hans Erich | Mujin Inc |
Takeichi, Yasuo | Osaka University |
Ushiku, Yoshitaka | OMRON SINIC X Corporation |
Ono, Kanta | Osaka University |
Keywords: Robotics and Automation in Life Sciences
Abstract: This study focuses on the powder grinding process, a necessary step for material synthesis in materials science experiments. In materials science, powder grinding is a time-consuming process typically executed by hand, as commercial grinding machines are unsuitable for small samples. Robotic powder grinding would solve this problem, but it is a challenging task for robots, as it requires observing the powder state and generating appropriate motions. Our previous study proposed a robotic powder grinding system using visual feedback. Although visual feedback is helpful for observing the powder distribution, the particle size during the grinding process remains invisible, leading to suboptimal robot actions. In some cases, the robot chose to gather the powder even though continuing to grind would have produced finer powder. In this paper, we present a multi-modal robotic grinding system that uses both audio and visual feedback. It makes use of the grinding sound, which carries information about the grinding progress, as the particle size strongly affects the audio intensity. The audio feedback enables the robot to grind until the powder is sufficiently fine. In our experiments, the robot ground 80.5% of the powder to a particle size smaller than 250 μm with audio and visual feedback and 68% without audio feedback, indicating that multi-modal feedback is an effective tool for producing finer powder. We conclude that the addition of audio feedback provides crucial information to the robot, allowing it to better understand the progress of the grinding process and make more optimal decisions. This robot system can be used to prepare samples in materials science experiments and to analyze the grinding process.
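The role of the audio channel can be sketched as a simple decision rule: grinding sound intensity falls as particles get finer, so the robot keeps grinding until intensity drops below a threshold, then gathers scattered powder. The thresholds and signal scaling below are illustrative assumptions, not values from the paper:

```python
# Hedged sketch of audio-visual action selection for robotic grinding.
# i_fine and s_max are hypothetical thresholds on normalized signals.

def choose_action(audio_intensity, powder_spread, i_fine=0.2, s_max=0.6):
    """Pick the next grinding primitive from two normalized sensory cues."""
    if audio_intensity > i_fine:
        return "grind"      # loud grinding sound: particles still coarse
    if powder_spread > s_max:
        return "gather"     # quiet but scattered: pull the powder together
    return "done"           # fine and compact: stop

assert choose_action(0.8, 0.1) == "grind"
assert choose_action(0.1, 0.9) == "gather"
assert choose_action(0.1, 0.2) == "done"
```

Without the audio cue, the first branch would have to rely on the visual spread alone, which is exactly the failure mode the abstract describes (gathering when further grinding would help).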
|
|
09:24-09:30, Paper WeAT5.10 | Add to My Program |
Deep Learning-Based Leaf Detection for Robotic Physical Sampling with P-AgBot |
|
Deb, Aarya | Purdue University |
Kim, Kitae | Purdue University |
Cappelleri, David | Purdue University |
Keywords: Robotics and Automation in Agriculture and Forestry, Deep Learning in Grasping and Manipulation, RGB-D Perception
Abstract: Automating leaf detection and physical leaf sample collection using Internet of Things (IoT) technologies is a crucial task in precision agriculture. In this paper, we present a deep learning-based approach for detecting and segmenting crop leaves for robotic physical sampling. We discuss a method for generating a physical dataset of agricultural crops. Our proposed pipeline uses an RGB-D camera for dataset collection, fusing the depth frame with RGB images to train Mask R-CNN and YOLOv5 models. We also propose a novel leaf pose estimation algorithm for physical sampling that maximizes leaf sample area while using a robotic arm integrated into the P-AgBot platform. The proposed approach has been experimentally validated on corn and sorghum, in both indoor and outdoor environments. Our method achieved a best-case detection rate of 90.6%, a 9% smaller error compared to our previous method, and approximately 80% smaller error compared to other state-of-the-art methods in estimating the leaf position.
|
|
09:30-09:36, Paper WeAT5.11 | Add to My Program |
In-Situ Measurement of Extrusion Width for Fused Filament Fabrication Process Using Vision and Machine Learning Models |
|
Shabani, Arya | University of Bath |
Martinez-Hernandez, Uriel | University of Bath |
Keywords: Computer Vision for Manufacturing, Object Detection, Segmentation and Categorization, Additive Manufacturing
Abstract: Measuring the geometry of the printed road is key to detecting anomalies in 3D printing processes. Although commercial 3D printers can measure the extrusion height using various distance sensors, measuring the width in real time remains a challenge. This paper presents a visual in-situ monitoring system to measure the width of the printed filament road in 2D patterns. The proposed system is composed of a printable shroud with an embedded camera setup and a visual detection approach based on a two-stage instance segmentation method. Each of the segmentation and localization stages can use multiple computational approaches, including a Gaussian mixture model, a color filter, and deep neural network models. The visual monitoring system is mounted on a standard 3D printer and validated by measuring printed filament roads of sub-millimeter widths. The results on accuracy and robustness reveal that combinations of deep models for both the segmentation and localization stages perform best. In particular, a fully connected CNN segmentation model combined with a YOLO object detector can measure sub-millimeter extrusion width with 90 µm accuracy in 125 ms. This visual monitoring system has the potential to improve the control of printing processes through real-time measurement of printed filament geometry.
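Once the filament road is segmented, width estimation reduces to counting mask pixels across the road and applying a pixel-to-millimetre calibration. A minimal sketch under an assumed (hypothetical) calibration and a road running horizontally in the image:

```python
# Illustrative width measurement from a binary road mask: count mask
# pixels per column and convert with an assumed mm-per-pixel scale.

def road_width_mm(mask, mm_per_px=0.05):
    """Mean extrusion width of a segmented filament road.

    mask: rows x cols of 0/1, road roughly horizontal;
    mm_per_px: camera calibration (hypothetical value here).
    """
    widths = [sum(col) for col in zip(*mask)]   # road thickness per column
    widths = [w for w in widths if w > 0]       # skip columns with no road
    return mm_per_px * sum(widths) / len(widths)

mask = [[0, 1, 1, 0],
        [0, 1, 1, 1],
        [0, 0, 1, 0]]
# Non-empty columns have 2, 3, and 1 road pixels: mean 2.0 px -> 0.1 mm.
assert abs(road_width_mm(mask) - 0.1) < 1e-9
```

A real pipeline would first rectify the road orientation and reject segmentation outliers before averaging, but the per-column counting step is the core of the width measurement.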
|
|
09:36-09:42, Paper WeAT5.12 | Add to My Program |
Motion Orchestration in Dual-Stage Wafer Scanners |
|
Al-Rawashdeh, Yazan | Memorial University of Newfoundland |
Al Janaideh, Mohammad | University of Guelph |
Heertjes, Marcel | Eindhoven University of Technology |
Keywords: Semiconductor Manufacturing
Abstract: In the field of semiconductor manufacturing, lithography machines are becoming increasingly sophisticated systems of systems. As an example, a TWINSCAN wafer scanner machine is composed of wafer and reticle handlers, a reticle, optics, and two wafer chains, or systems. In previous studies, we covered the interactions between the reticle, optics, and wafer chains during the step-and-scan cycle. In this study, we focus on the interaction between the additional wafer chain, responsible for aligning the wafer substrate and taking its height map during the measurement cycle, and the other chains that are active during the step-and-scan cycle. While working in parallel increases machine throughput, the inertial forces resulting from the associated motion of the two cycles induce vibration that may propagate throughout the chains in the machine if no appropriate measures are taken. In this investigation, we look at the reference trajectories responsible for steering the chains throughout the two cycles and propose a reference trajectory orchestration that factors in the machine design, geometry, mass distribution, and functions. Consequently, this orchestration suppresses the induced vibration without sacrificing machine throughput while keeping the involved control loops intact. We also provide a means to generate the force and moment vector fields that can be used to assess the machine performance under any designed reference trajectory.
|
|
09:42-09:48, Paper WeAT5.13 | Add to My Program |
Towards Packaging Unit Detection for Automated Palletizing Tasks |
|
Völk, Markus | Fraunhofer IPA |
Kleeberger, Kilian | Fraunhofer IPA |
Kraus, Werner | Fraunhofer IPA |
Bormann, Richard | Fraunhofer IPA |
Keywords: Logistics, Perception for Grasping and Manipulation, RGB-D Perception
Abstract: For various automated palletizing tasks, the detection of packaging units is a crucial step preceding the actual handling of the packaging units by an industrial robot. We propose an approach to this challenging problem that is fully trained on synthetically generated data and can be robustly applied to arbitrary real-world packaging units without further training or setup effort. The proposed approach is able to handle sparse and low-quality sensor data, can exploit prior knowledge if available, and generalizes well to a wide range of products and application scenarios. To demonstrate the practical use of our approach, we conduct an extensive evaluation on real-world data with a wide range of different retail products. Further, we integrated our approach into a lab demonstrator; a commercial solution will be marketed through an industry partner.
|
|
WeAT6 Regular session, 140FG |
Add to My Program |
Soft Robot Design and Modelling |
|
|
Chair: Huang, Xiaonan | University of Michigan |
Co-Chair: Bern, James | Williams College |
|
08:30-08:36, Paper WeAT6.1 | Add to My Program |
Robotic Barrier Construction through Weaved, Inflatable Tubes |
|
Kim, Heather Jin Hee | Cornell University |
Abdel-Raziq, Haron | Cornell University |
Liu, Xinyu | Cornell University |
Siskovic, Alexandra | Cornell University |
Patil, Shreyas | Cornell University |
Petersen, Kirstin Hagelskjaer | Cornell University |
Kao, Hsin-Liu (Cindy) | Cornell University |
Keywords: Soft Robot Materials and Design, Soft Robot Applications, Robotics and Automation in Construction
Abstract: In this article, we present a mechanism and a related path planning algorithm to construct light-duty barriers out of extruded, inflated tubes weaved around existing environmental features. Our extruded tubes are based on everted vine robots, and in this context we present a new method to steer their growth. We characterize the mechanism in terms of accuracy and resilience and, toward the tubes' use as barriers, their ability to withstand distributed loads. We further explore an algorithm which, given a feature map and the size and direction of the external load, can determine where and how to extrude the barrier. Finally, we showcase the potential of this method with an autonomously extruded two-layer wall weaved around three pipes. While preliminary, our work indicates that this method has potential for barrier construction in cluttered environments, e.g., shelters against wind or snow. Future work may show how to achieve tighter weaves, how to leverage weave friction for improved strength, how to assess barrier performance for feedback control, and how to operate the extrusion mechanism from a mobile robot.
|
|
08:36-08:42, Paper WeAT6.2 | Add to My Program |
Bistable Tensegrity Robot with Jumping Repeatability Based on Rigid Plate-Shaped Compressors |
|
Shimura, Kento | Shinshu University |
Iwamoto, Noriyasu | Shinshu Univ |
Umedachi, Takuya | Shinshu University |
Keywords: Soft Robot Materials and Design, Tendon/Wire Mechanism, Space Robotics and Automation
Abstract: This study presents a bistable tensegrity robot that can perform repetitive jumps using one motor. The robot is based on a tensegrity structure that uses rigid plate-shaped compressors. To achieve bistability in this structure, we optimized the position of additional springs using a physics simulator that considers the geometric constraints attributed to collisions between compression members. A prototype was constructed based on the simulation model. To achieve jumping repeatability, we used one motor to control three tendons, used respectively to control the additional spring strain, trigger the snap-through motion, and reform the structure to its original shape. The prototype could jump using snap-through motion and reform to its original shape based on motor rotation. Furthermore, the robot demonstrated its ability to jump over flights of stairs by attaching a stand with a slight angle and using its jumping repeatability.
|
|
08:42-08:48, Paper WeAT6.3 | Add to My Program |
Accessible Soft Robotics Education with Re-Configurable Balloon Robots |
|
Wu, Yi-Shiun | EPFL |
Gilday, Kieran | EPFL |
Hughes, Josie | EPFL |
Keywords: Soft Robot Materials and Design, Education Robotics, Soft Robot Applications
Abstract: Soft robotics requires effective tools to educate the next generation of engineers and researchers. With no universally accepted principles for education and high barriers to entry in terms of fabrication and hardware, education to date has been highly ad hoc. We present a low-cost toolkit based on re-configurable balloons which allows rapid development of soft yet functional robots. This provides practical demonstrations of key soft robotic principles, including morphology, stiffness control, controller dependencies, and modulation of environmental interactions, while grounding robot behaviors in fundamental mechanical models. We provide a framework for assembling balloon structures, incorporating actuation, and exploring interactions. A diverse set of robots has been developed to show the potential of these balloon-bots for educational activities at the undergraduate level or below. In particular, different modes of locomotion are shown using robots, each of which has an assembly time under 5 minutes. These robots can teach skills ranging from component integration and implementation to key soft robotic design principles and embodied intelligence.
|
|
08:48-08:54, Paper WeAT6.4 | Add to My Program |
A Fabrication and Simulation Recipe for Untethering Soft-Rigid Robots with Cable-Driven Stiffness Modulation |
|
Bern, James | Williams College |
Patterson, Zachary | MIT |
Zamora Yáñez, Leonardo | Massachusetts Institute of Technology |
Misquitta, Kristoff | MIT |
Rus, Daniela | MIT |
Keywords: Soft Robot Materials and Design, Soft Robot Applications, Modeling, Control, and Learning for Soft Robots
Abstract: We explore the idea of robotic mechanisms that can shift between soft and rigid states, with the long-term goal of creating robots that marry the flexibility and robustness of soft robots with the strength and precision of rigid robots. We present a simple yet effective method to achieve large and rapid stiffness variations by compressing and relaxing a flexure using cables. Next, we provide a differentiable modeling framework that can be used for motion planning, which simultaneously reasons about the modulated stiffness joints, tendons, rigid joints, and basic hydrodynamics. We apply this stiffness tuning and simulation recipe to create sori, an untethered soft-rigid robotic sea turtle capable of various swimming maneuvers.
|
|
08:54-09:00, Paper WeAT6.5 | Add to My Program |
Integrated Design of a Robotic Bio-Inspired Trunk |
|
Chevillon, Tanguy | Junia |
Mbakop, Steeve | Junia |
Tagne, Gilles | Yncréa Hauts De France / ISEN Lille |
Merzouki, Rochdi | CRIStAL, CNRS UMR 9189, University of Lille1 |
Keywords: Soft Robot Materials and Design, Modeling, Control, and Learning for Soft Robots, Soft Robot Applications
Abstract: Soft-continuum manipulators are of increasing interest to researchers for various non-destructive applications (minimally invasive surgery, fibroscopy, oncology, pipe exploration, and many others). They are made with soft materials or special arrangements of actuators that allow them to exhibit resilience and dexterity. The concept of proprioceptive soft-continuum manipulators still remains a major challenge for soft roboticists due to issues related to the manufacturing process, which becomes very expensive, involving sophisticated or experimental tools, highly skilled technicians, and time. However, manipulator proprioception is very useful for enhancing dexterity during manipulation. Hence, this paper investigates a quick and simple approach for the integrated design of a proprioceptive soft robotic bio-inspired trunk made with Dragon Skin 30 material. The soft manipulator is made up of two segments, each with three independent physical control inputs. It has embedded electronics mainly composed of IMUs, which allow controlling the shape kinematics using a control-oriented modeling approach inspired by the kinematic control of a puppet toy. The developed modeling approach is a Reduced Order Modeling (ROM) approach that uses Pythagorean Hodograph (PH) curves, reducing in real time the control dimension of the robot to the virtual control points of its representative PH curve. The proposed investigation also presents a comprehensive approach to the manufacturing process of soft-continuum manipulators with complex geometry.
|
|
09:00-09:06, Paper WeAT6.6 | Add to My Program |
A New Design of Multilayered String Jamming Mechanism with Three-Degree-Of-Freedom |
|
Michikawa, Ryohei | Kyoto University |
Tadakuma, Kenjiro | Tohoku University |
Matsuno, Fumitoshi | Kyoto University |
Keywords: Soft Robot Materials and Design, Tendon/Wire Mechanism, Mechanism Design
Abstract: A robot must exhibit softness so as not to accidentally damage its environment. However, stiffness is also necessary so that the robot can transmit forces and perform tasks. In soft robotics, it is desirable to be able to switch between two states, namely a flexible state for adaptation to the environment and a rigid state for the transmission of forces. String jamming mechanisms, which comprise many units connected in a bead-like pattern, have received attention for their ability to switch between flexible and rigid states. In this study, we propose a new design of the string jamming mechanism that enhances the maximum stiffness in the rigid state while maintaining a high fitting performance to the environment in the flexible state. We evaluate the fitting of the mechanism to the environment in a qualitative geometric discussion and compare the performance of the mechanism with that of existing string jamming mechanisms. The results of experiments measuring the maximum stiffness show the usefulness of the proposed mechanism from a quantitative point of view.
|
|
09:06-09:12, Paper WeAT6.7 | Add to My Program |
Protective Skin Mechanism with an Exhaustive Arrangement of Tiny Rigid Bodies for Soft Robots –Evaluation of Puncture Resistance, Elasticity, and Descaling Resistance of the Scale Mechanism– |
|
Tadakuma, Kenjiro | Tohoku University |
Tetsui, Hikaru | Tohoku University |
Watanabe, Masahiro | Tohoku University |
Tadokoro, Satoshi | Tohoku University |
Keywords: Soft Robot Materials and Design, Mechanism Design, Biomimetics
Abstract: Soft actuators have several advantages, including large deformation, safety and adaptability to the environment, and shock absorbance. However, owing to their soft bodies, they are weak against sharp objects. This paper proposes a protective skin mechanism with an exhaustive arrangement of tiny rigid bodies. Small pieces were sewn onto an elastic sheet using Kevlar strings. We conducted measurements of the puncture resistance, elasticity, and descaling resistance of the scale mechanisms. The results indicated that the proposed scale mechanism had only 150% larger elasticity than a simple silicone sheet. In addition, approximately 15 N was required for descaling, which is seven times larger than that of the glued scale. There was no puncture even when pricked with a needle.
|
|
09:12-09:18, Paper WeAT6.8 | Add to My Program |
Printable Bistable Structures for Programmable Frictional Skins of Soft-Bodied Robots |
|
Ta, Tung D. | The University of Tokyo |
Kawahara, Yoshihiro | The University of Tokyo |
Keywords: Soft Robot Materials and Design, Mechanism Design, Biologically-Inspired Robots
Abstract: Soft robots made of flexible materials are highly adaptive, easy to fabricate, and safer to interact with. One of the ways for soft robots to interact with the surrounding environment is through their deformable bodily characteristics including internal body stiffness and external body friction. Though the flexibility of soft-bodied robots has been rigorously studied, the frictional skin of such soft-bodied robots, acting as a mechanical interface between the robot and the environment, remains unexplored. Being able to design the frictional skin will make soft-bodied robots more versatile in environmental navigation, more dexterous in manipulation tasks, and more flexible in haptic feedback. In this paper, we propose a robotic skin that can be programmed dynamically to change the mode of friction. The robotic skin is based on bistable bellow structures that can be switched between two folding states to change the contact points between the robotic skin and the ground. Our robotic skin can dynamically change its anisotropic frictional behavior to add another dimension to the designing space of soft robotics.
|
|
09:18-09:24, Paper WeAT6.9 | Add to My Program |
MCLARI: A Shape-Morphing Insect-Scale Robot Capable of Omnidirectional Terrain-Adaptive Locomotion in Laterally Confined Spaces |
|
Kabutz, Heiko Dieter | University of Colorado Boulder |
Hedrick, Alexander | University of Colorado Boulder |
McDonnell, William Parker | University of Colorado Boulder |
Jayaram, Kaushik | University of Colorado Boulder |
Keywords: Soft Robot Materials and Design, Legged Robots, Biologically-Inspired Robots
Abstract: Soft compliant microrobots have the potential to deliver significant societal impact when deployed in applications such as search and rescue. In this research we present mCLARI, a body compliant quadrupedal microrobot of 20 mm neutral body length and 0.97 g, improving on its larger predecessor, CLARI. This robot has four independently actuated leg modules with 2 degrees of freedom, each driven by piezoelectric actuators. The legs are interconnected in a closed kinematic chain via passive body joints, enabling passive body compliance for shape adaptation to external constraints. Despite scaling its larger predecessor down to 60% in length and 38% in mass, mCLARI maintains 80% of the actuation power to achieve high agility. Additionally, we demonstrate the new capability of passively shape-morphing mCLARI -- omnidirectional laterally confined locomotion -- and experimentally quantify its running performance achieving a new unconstrained top speed of 3 bodylengths/s (60 mm/s). Leveraging passive body compliance, mCLARI can navigate through narrow spaces with a body compression ratio of up to 1.5x the neutral body shape.
|
|
09:24-09:30, Paper WeAT6.10 | Add to My Program |
“RobOstrich” Manipulator: A Novel Mechanical Design and Control Based on the Anatomy and Behavior of an Ostrich Neck |
|
Nakano, Kazashi | The University of Tokyo |
Gunji, Megu | National Museum of Nature and Science, Tokyo |
Ikeda, Masahiro | University of Tokyo |
Or, Keung | Shinshu University |
Ando, Mitsuhito | University of Tsukuba |
Inoue, Katsuma | The University of Tokyo |
Mochiyama, Hiromi | University of Tsukuba |
Nakajima, Kohei | University of Tokyo |
Niiyama, Ryuma | University of Tokyo |
Kuniyoshi, Yasuo | The University of Tokyo |
Keywords: Soft Robot Materials and Design, Modeling, Control, and Learning for Soft Robots, Biologically-Inspired Robots
Abstract: Flexible manipulators have high degrees of freedom and deformability, enabling dexterous movements and allowing for unexpected contacts with the environment. Underactuated tendon-driven mechanisms are the most widely adopted because of their simplicity and effectiveness. However, they suffer from difficulty in modeling and in achieving dexterity and structural stability. In this paper, we focus on ostriches, which can dexterously and swiftly move their flexible necks. We carried out a detailed dissection of ostrich necks and discovered a specific musculo-tendon-skeletal structure. Based on the findings related to this structure, we devised a novel mechanical design and control method manifested as the "RobOstrich" manipulator. Our robot experiments confirm that it exhibits movement patterns similar to an ostrich neck. It is also flexible yet structurally stable, enabling dexterous reaching movements. This work also contributes to biology by providing a constructive understanding of the functionality of the morphology of an ostrich neck.
|
|
09:30-09:36, Paper WeAT6.11 | Add to My Program |
Bio-Inspired Deformable Propeller Concept for Smooth Human-UAV Interaction and Efficient Thrust Generation |
|
Ruiz Vincueria, Fernando | Universidad De Sevilla |
Arrue, Begońa C. | Universidad De Sevilla |
Ollero, Anibal | AICIA. G41099946 |
Keywords: Soft Robot Materials and Design, Safety in HRI, Aerial Systems: Mechanics and Control
Abstract: This letter presents a novel deformable propeller concept, 3D-printed in flexible thermoplastic polyurethane, which stores impact energy as elastic energy for smooth collisions, reducing risk in human-UAV interactions. However, such a design experiences elastic deformations when exposed to aerodynamic and centrifugal loads, the most relevant occurring in the 3-5 krpm range. At higher speeds, the centrifugal force is dominant and the propeller becomes quite stiff. This work investigates the propeller deformation angles (which ultimately alter the aerodynamic profile and limit thrust generation) based on Fluid-Structure Interaction (FSI) simulations. These results are leveraged to introduce two complementary solutions: (I) deformation-reducing internal fiber distributions, inspired by ultra-efficient dragonfly wings; and (II) anticipatory designs which pre-modify pitch and roll angles based on simulation results to ensure optimal performance at the target rotational velocity. This letter offers a multi-configuration analysis, yielding an easy-to-manufacture propeller and a specific design methodology that results in increased efficiency and reduced impact recovery time. This work presents collision tests with various objects, as well as a proof of concept of its flight capabilities on a conventional quadrotor, including physical interaction with humans. These findings are valuable for the development of collision control strategies for UAVs.
|
|
09:36-09:42, Paper WeAT6.12 | Add to My Program |
A High-Strength, Highly-Flexible Robotic Strap for Harnessing, Lifting, and Transferring Humans |
|
Barhydt, Kentaro | Massachusetts Institute of Technology |
Asada, Harry | MIT |
Keywords: Soft Robot Materials and Design, Compliant Joints and Mechanisms
Abstract: Safely harnessing and lifting humans for transfer is a challenging unsolved problem for current robots because of the high forces and gentle interaction necessary to do so. Straps, however, are highly beneficial for manually performing this task primarily because of their simultaneously high tensile strength and high compliant bending flexibility. We propose the Robotic Strap, a novel concept and design for a new type of manipulator that can passively harness and lift humans safely as straps can, as well as actively articulate itself around the human into the desired harnessing configurations. The passive structure is characterized by the high tensile strength and bending flexibility of straps. The design consists of a hyper-articulated backbone with rolling-contact joints fastened by dual-pulley cord mechanisms, and soft thin McKibben artificial muscles embedded along its length that collectively actuate the joints. We present the concept, framework, realization, and implementation of the Robotic Strap design, as well as model and experimentally validate the key characteristics. The prototype has a tensile load capacity of 1314.0 N, a maximum joint bending resistance of <0.1 Nm, and successfully demonstrated safe and effective harnessing and lifting of three human participants without any manual intervention.
|
|
09:42-09:48, Paper WeAT6.13 | Add to My Program |
Flexible and Slim Device Switching Air Blowing and Suction by a Single Airflow Control |
|
Nojiri, Seita | Kanazawa University |
Nishimura, Toshihiro | Kanazawa University |
Tadakuma, Kenjiro | Tohoku University |
Watanabe, Tetsuyou | Kanazawa University |
Keywords: Soft Robot Materials and Design, Hydraulic/Pneumatic Actuators
Abstract: This study proposes a soft robotic device with a slim and flexible body that switches between air blowing and suction with a single airflow control. Suction is achieved by jet flow entraining surrounding air, and blowing is achieved by blocking and reversing jet flow. The thin and flexible flap gate enables the switching. Air flow is blocked while the gate is closed and passes through while the gate is open. The opening and closing of the flap gate are controlled by the expansion of the inflatable chambers installed near the gate. The extent of expansion is determined by the upstream static pressure. Therefore, the gate can be controlled by the input airflow rate. The dimensions of the flap gate are introduced as a design parameter, and we show that the parameter contributes to the blowing and suction capacities. We also experimentally demonstrate that the proposed device is available for a variable friction system and an end effector for picking up a thin object covered with dust.
|
|
WeAT7 Regular session, 258/259 |
Add to My Program |
Surgical Robotics - Control |
|
|
Chair: Dupont, Pierre | Children's Hospital Boston, Harvard Medical School |
Co-Chair: Julien, Leclerc | University of Houston |
|
08:30-08:36, Paper WeAT7.1 | Add to My Program |
Flexible Needle Bending Model for Spinal Injection Procedures |
|
Wang, Yanzhou | Johns Hopkins University |
Kwok, Ka-Wai | The University of Hong Kong |
Cleary, Kevin | Children's National Medical Center |
Taylor, Russell H. | The Johns Hopkins University |
Iordachita, Ioan Iulian | Johns Hopkins University |
Keywords: Medical Robots and Systems, Surgical Robotics: Steerable Catheters/Needles
Abstract: An in situ needle manipulation technique used by physicians when performing spinal injections is modeled to study its effect on needle shape and needle tip position. A mechanics-based model is proposed and solved using finite element method. A test setup is presented to mimic the needle manipulation motion. Tissue phantoms made from plastisol as well as porcine skeletal muscle samples are used to evaluate the model accuracy against medical images. The effect of different compression models as well as model parameters on model accuracy is studied, and the effect of needle-tissue interaction on the needle remote center of motion is examined. With the correct combination of compression model and model parameters, the model simulation is able to predict needle tip position within submillimeter accuracy.
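As a rough intuition for the bending regime such a model operates in (not the paper's finite-element model, which captures tissue interaction and the manipulation motion), a closed-form Euler-Bernoulli cantilever check can be sketched; the needle dimensions and load below are assumptions, not values from the paper:

```python
import math

# Hedged sanity check: closed-form Euler-Bernoulli tip deflection of a
# cantilever under an end load, delta = F * L^3 / (3 * E * I). This only
# illustrates the small-deflection bending regime of a slender needle.

def cantilever_tip_deflection(force_n, length_m, youngs_pa, inertia_m4):
    """Tip deflection of a cantilever under a point end load (small deflections)."""
    return force_n * length_m**3 / (3.0 * youngs_pa * inertia_m4)

# Illustrative needle-like values (assumed, not taken from the paper):
d = 0.7e-3                      # outer diameter [m], solid cylinder assumed
I = math.pi * d**4 / 64.0       # second moment of area of a circular section
E = 200e9                       # Young's modulus of stainless steel [Pa]

delta = cantilever_tip_deflection(force_n=0.02, length_m=0.09, youngs_pa=E, inertia_m4=I)
print(f"tip deflection: {delta * 1000:.2f} mm")
```

The linearity of the formula (doubling the load doubles the deflection) is exactly what breaks down for the large deflections spinal needles undergo, which is why the paper resorts to a mechanics-based finite element model.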
|
|
08:36-08:42, Paper WeAT7.2 | Add to My Program |
FABRIKv: A Fast, Iterative Inverse Kinematics Solver for Surgical Continuum Robot with Variable Curvature Model |
|
Wang, Fuhao | Fudan University |
Wang, Ye | Fudan University |
Kang, Xiaoyang | Fudan University |
Wang, Hongbo | Fudan University |
Luo, Jingjing | Fudan University |
Chen, Li | Fudan |
Tang, Xiuhong | Fudan University |
Keywords: Tendon/Wire Mechanism, Surgical Robotics: Laparoscopy, Medical Robots and Systems
Abstract: Due to their high flexibility, large workspace, and good human-body compatibility, flexible tendon-driven surgical continuum robots have attracted a lot of attention in robot-assisted minimally invasive surgery. However, due to the coupling between the position and orientation of the continuum robot, and its easy deformation under external forces, its inverse kinematics solution has always been a challenge. This paper proposes a fast inverse kinematics solver for surgical continuum robots with a variable curvature model. First, the deformation of the continuum robot is analyzed, and a representation method for the variable curvature model is proposed. Next, to solve the inverse kinematics problem when the continuum robot deforms under load, FABRIKv is proposed by extending the Forward And Backward Reaching Inverse Kinematics (FABRIK) algorithm. During the inverse kinematics solution, the algorithm preserves the real-time nature of FABRIK and corrects for deformation effects caused by the load. Finally, experiments verify the rationality and effectiveness of the variable curvature model representation, as well as the speed and accuracy of the FABRIKv solver.
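The solver extends the classic FABRIK iteration; a minimal sketch of plain constant-link-length FABRIK on a 2D chain is below (the paper's FABRIKv adds a load-induced variable-curvature correction that is not reproduced here):

```python
import math

def fabrik(joints, target, tol=1e-4, max_iter=100):
    """Plain FABRIK on a 2D serial chain (list of (x, y) joints, fixed base).
    Standard constant-link-length algorithm, not the paper's FABRIKv."""
    joints = [list(p) for p in joints]
    lengths = [math.dist(joints[i], joints[i + 1]) for i in range(len(joints) - 1)]
    base = list(joints[0])
    if math.dist(base, target) > sum(lengths):      # unreachable: stretch toward it
        for i in range(len(lengths)):
            lam = lengths[i] / math.dist(joints[i], target)
            joints[i + 1] = [(1 - lam) * joints[i][k] + lam * target[k] for k in range(2)]
        return joints
    for _ in range(max_iter):
        # forward reaching: pin the tip to the target, work back toward the base
        joints[-1] = list(target)
        for i in range(len(joints) - 2, -1, -1):
            lam = lengths[i] / math.dist(joints[i + 1], joints[i])
            joints[i] = [(1 - lam) * joints[i + 1][k] + lam * joints[i][k] for k in range(2)]
        # backward reaching: re-anchor the base, work out to the tip
        joints[0] = list(base)
        for i in range(len(joints) - 1):
            lam = lengths[i] / math.dist(joints[i], joints[i + 1])
            joints[i + 1] = [(1 - lam) * joints[i][k] + lam * joints[i + 1][k] for k in range(2)]
        if math.dist(joints[-1], target) < tol:
            break
    return joints

solution = fabrik([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)], (1.5, 1.5))
print("tip:", solution[-1])
```

Each iteration is two sweeps of simple point interpolations, which is what makes FABRIK-style solvers attractive for real-time continuum-robot control.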
|
|
08:42-08:48, Paper WeAT7.3 | Add to My Program |
Design and Synchronous Control of a Magnetically-Actuated and Ultrasound-Guided Multi-Arm Robotic System |
|
Li, Zhengyang | University of Macau |
Xu, Qingsong | University of Macau |
Keywords: Medical Robots and Systems, Motion Control, Surgical Robotics: Steerable Catheters/Needles
Abstract: This paper presents the design of a new multi-arm robotic system with mobile magnetic actuation and extracorporeal ultrasound guidance dedicated to magnetic catheterization. The kinematic models of the external mobile actuation arm (EMAA) and the extracorporeal ultrasound-integrated tracking arm (EUTA) are derived based on Denavit-Hartenberg (DH) parameters, including specially designed end-effectors. The synchronous control scheme for the mobile magnet and mobile ultrasound probe is introduced, with polar coordinate-based magnetic actuation and a visual servo-based ultrasound tracking method. Meanwhile, a denoising algorithm based on Speckle Reduction Anisotropic Diffusion (SRAD) is implemented. The effectiveness of the proposed robotic system has been verified through several experimental studies, e.g., ex-vivo tests of catheter steering in an endovascular phantom and a soft tissue-imitating phantom, with an average error of 0.32 mm and a signal-to-noise ratio (SNR) of 12.2 for the ultrasound imaging.
|
|
08:48-08:54, Paper WeAT7.4 | Add to My Program |
A Shared-Control Dexterous Robotic System for Assisting Transoral Mandibular Fracture Reduction: Development and Cadaver Study |
|
Wang, Yan | The Chinese University of Hong Kong |
Zheng, Hao | The Hong Kong Polytechnic University |
Lee, Yu-Chung | Cornerstone Robotics Ltd |
Chan, Catherine Po Ling | The Chinese University of Hong Kong |
Chan, Ying-Kuen | The Chinese University of Hong Kong |
Taylor, Russell H. | The Johns Hopkins University |
Au, K. W. Samuel | The Chinese University of Hong Kong |
Keywords: Medical Robots and Systems
Abstract: The rigid and straight nature of conventional surgical drills and screwdrivers makes it difficult to access the posterior mandible for fracture reduction without the creation of facial incisions. To assist transoral mandibular fracture reduction in hard-to-reach areas, we propose a shared-control dexterous robotic system. The end effector of this system is an articulated drilling/screwing tool to provide distal dexterity. This system uses an admittance-control-based approach to provide precision and stability during shared-control hole-drilling processes. A cadaver study showed the efficacy of the proposed system to assist plate fixation in the reduction of mandibular fractures. The proposed articulated surgical tool was capable of drilling holes in and driving screws into the mandible of a cadaver head. In addition, the shared-control robotic system ensured that the drill moved along its axial direction, leading to stable and precise hole drilling.
|
|
08:54-09:00, Paper WeAT7.5 | Add to My Program |
Model-Based Bending Control of Magnetically-Actuated Robotic Endoscopes for Automatic Retroflexion in Confined Spaces |
|
Yichong, Sun | The Chinese University of Hong Kong |
Li, Yehui | The Chinese University of Hong Kong |
Li, Jixiu | The Chinese University of Hong Kong |
Ng, Wing Yin | The Chinese University of Hong Kong |
Xian, Yitian | The Chinese University of Hong Kong |
Huang, Yisen | The Chinese University of Hong Kong |
Chiu, Philip, Wai-yan | Chinese University of Hong Kong |
Li, Zheng | The Chinese University of Hong Kong |
Keywords: Medical Robots and Systems, Kinematics, Motion Control
Abstract: This paper addresses kinematic model-based bending control of a magnetically actuated robotic endoscope and its application to automatic retroflexion. Using Cosserat rod theory and the transformation at the endoscope's magnetic tip, a comprehensive kinematic model of the magnetically actuated robotic endoscope is established. A magnetic control scheme for the bending motion is then proposed by combining an error-feedback PID control strategy with a model-based feedback approach. Moreover, retroflexion, a distinctive bending motion, is considered: a strategy for retroflexion in confined spaces is presented, based on serial waypoints that keep the magnetic tip as close to the midline as possible. Finally, the modeling and bending control scheme and the confined-space retroflexion strategy are examined on a magnetically actuated robotic endoscope system to demonstrate the effectiveness and applicability of the theoretical approach. The experimental results indicate that the designed controller can drive the endoscope to bend to the desired pose, and that confined-space retroflexion reduces the sweeping area by about 47.01% and the final distance to the midline by 79.25% compared to "U"-type retroflexion.
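The error-feedback part of such a control scheme is a standard PID loop; a minimal sketch on an assumed first-order bending-angle plant is below (the Cosserat-rod model-based feedback term of the paper is omitted, and all gains and dynamics are invented for illustration):

```python
class PID:
    """Textbook discrete PID controller. The paper pairs such error feedback
    with a Cosserat-rod model-based term, which this sketch omits."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral, self.prev_err = 0.0, 0.0

    def update(self, error):
        self.integral += error * self.dt
        deriv = (error - self.prev_err) / self.dt
        self.prev_err = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv

# Assumed first-order bending dynamics standing in for the endoscope tip.
dt, angle, target = 0.01, 0.0, 45.0           # angles in degrees
pid = PID(kp=2.0, ki=0.5, kd=0.05, dt=dt)
for _ in range(2000):                          # simulate 20 s
    angle += dt * (pid.update(target - angle) - 0.5 * angle)
print(f"final bending angle: {angle:.2f} deg")
```

The integral term drives the steady-state error to zero on this toy plant; in the paper the model-based term supplies most of the actuation, with PID feedback correcting residual error.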
|
|
09:00-09:06, Paper WeAT7.6 | Add to My Program |
3D Laser-And-Tissue Agnostic Data-Driven Method for Robotic Laser Surgical Planning |
|
Ma, Guangshen | Duke University |
Prakash, Ravi | Duke University |
Mann, Brian | Duke University |
Ross, Weston | Duke University |
Codd, Patrick | Children's Hospital, Boston |
Keywords: Medical Robots and Systems, Surgical Robotics: Planning, Computer Vision for Medical Robotics
Abstract: In robotic laser surgery, shape prediction of a one-shot ablation crater is an important problem for minimizing errant overcutting of healthy tissue during pathological tissue resection and precise tumor removal. Since it is difficult to physically model the laser-tissue interaction due to the variety of optical tissue properties, the complicated heat-transfer process, and uncertainty about the chemical reaction, we propose a 3D crater prediction model based on an entirely data-driven method without any assumptions about laser settings and tissue properties. Based on the crater prediction model, we formulate a novel robotic laser planning problem to determine the optimal laser incident configuration, which aims to create a crater that aligns with the surface target (e.g., tumor, pathological tissue). To solve the one-shot ablation crater prediction problem, we model the 3D geometric relation between the tissue surface and the laser energy profile as a non-linear regression problem that can be represented by a single-layer perceptron (SLP) network. The SLP network is encoded in a novel kinematic model to predict the shape of the post-ablation crater with an arbitrary laser input. To estimate the SLP network parameters, we formulate a dataset of one-shot laser-phantom craters reconstructed from optical coherence tomography (OCT) B-scan images. To verify the method, the learned crater prediction model is applied to solve a simplified robotic laser planning problem modeled as a surface alignment error minimization problem. The initial results report about (91.2 ± 3.0)% 3D-crater-Intersection-over-Union (3D-crater-IoU) for the 3D crater prediction and an average of about 98.0% success rate for the simulated surface alignment experiments.
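The regression backbone here is a single-layer perceptron; a minimal gradient-descent fit of a one-input, one-output linear unit on invented "energy vs. depth" data sketches that idea (the paper's inputs, outputs, and OCT-derived dataset are far richer):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the paper's regression: map a scalar "incident energy"
# feature to crater depth. Data and the linear relation are invented.
x = rng.uniform(0.0, 1.0, size=(200, 1))
y = 0.8 * x + 0.1                      # assumed energy-to-depth relation

# Single-layer perceptron (one weight, one bias) trained by gradient
# descent on mean squared error.
w, b, lr = 0.0, 0.0, 0.5
for _ in range(500):
    pred = w * x + b
    grad_w = 2 * np.mean((pred - y) * x)
    grad_b = 2 * np.mean(pred - y)
    w -= lr * grad_w
    b -= lr * grad_b
print(f"learned w={w:.3f}, b={b:.3f}")
```

With noise-free synthetic data the fit recovers the generating coefficients; the paper's SLP instead regresses a full 3D geometric relation and is embedded in a kinematic model.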
|
|
09:06-09:12, Paper WeAT7.7 | Add to My Program |
Insertion, Retrieval and Performance Study of Miniature Magnetic Rotating Swimmers for the Treatment of Thrombi |
|
Lu, Yitong | University of Houston |
Ramos, Jocelyn | University of Houston |
Ghosn, Mohamad | Houston Methodist DeBakey Heart and Vascular Center |
Shah, Dipan J. | Houston Methodist DeBakey Heart & Vascular Center |
Becker, Aaron | University of Houston |
Julien, Leclerc | University of Houston |
Keywords: Medical Robots and Systems, Engineering for Robotic Systems
Abstract: Miniature Magnetic Rotating Swimmers (MMRSs) are untethered machines containing magnetic materials. An external rotating magnetic field produces a torque on the swimmers to make them rotate. MMRSs have propeller fins that convert the rotating motion into forward propulsion. This type of robot has been shown to have potential applications in the medical realm. This paper presents new MMRS designs with (1) an increased permanent magnet volume to increase the available torque and prevent the MMRS from becoming stuck inside a thrombus; (2) new helix designs that produce an increased force to compensate for the weight added by the larger permanent magnet volume; (3) different head drill shape designs that have different interactions with thrombi. The two best MMRS designs were tested experimentally by removing a partially dried 1-hour-old thrombus with flow in a bifurcating artery model. The first MMRS disrupted a large portion of the thrombus. The second MMRS retrieved a small remaining piece of the thrombus. In addition, a tool for inserting, retrieving, and switching MMRSs during an experiment is presented and demonstrated. Finally, this paper shows that the two selected MMRS designs can perform accurate 3D path-following.
|
|
09:12-09:18, Paper WeAT7.8 | Add to My Program |
Hybrid Tendon and Ball Chain Continuum Robots for Enhanced Dexterity in Medical Interventions |
|
Pittiglio, Giovanni | Harvard University |
Mencattelli, Margherita | Boston Children's Hospital, Harvard Medical School |
Donder, Abdulhamit | Imperial College London |
Chitalia, Yash | University of Louisville |
Dupont, Pierre | Children's Hospital Boston, Harvard Medical School |
Keywords: Medical Robots and Systems, Surgical Robotics: Steerable Catheters/Needles, Flexible Robotics
Abstract: A hybrid continuum robot design is introduced that combines a proximal tendon-actuated section with a distal telescoping section comprised of permanent-magnet spheres actuated using an external magnet. While, individually, each section can approach a point in its workspace from one or at most several orientations, the two-section combination possesses a dexterous workspace. The paper describes kinematic modeling of the hybrid design and provides a description of the dexterous workspace. We present experimental validation which shows that a simplified kinematic model produces tip position mean and maximum errors of 3% and 7% of total robot length, respectively.
|
|
09:18-09:24, Paper WeAT7.9 | Add to My Program |
Semi-Autonomous Assistance for Telesurgery under Communication Loss |
|
Ishida, Hisashi | Johns Hopkins University |
Munawar, Adnan | Johns Hopkins University |
Taylor, Russell H. | The Johns Hopkins University |
Kazanzides, Peter | Johns Hopkins University |
Keywords: Medical Robots and Systems, Telerobotics and Teleoperation, Virtual Reality and Interfaces
Abstract: Telesurgery has a clear potential for providing high-quality surgery to medically underserved areas like rural areas, battlefields, and spacecraft; nevertheless, effective methods to overcome unreliable communication systems are still lacking. Furthermore, it is not well understood how users react at the moment of communication loss and also during the loss. In this paper, we aim to analyze human response by proposing a telesurgery simulation framework that models an environment incorporating local and remote sites. Furthermore, this framework generates structured data for human behavior analysis and can provide different forms of assistance during the communication failure and at the communication recovery. We investigated three different types of assistance: User-centered, Robot-centered, and Hybrid. A 12-person user study was carried out using the proposed telesurgery simulation, in which participants completed a peg transfer task with random communication loss. The collected data was used to analyze the human response to a communication failure. The proposed Hybrid method reduced temporal demand with no additional completion time compared to the baseline control method, where users were unable to move the input device during the communication loss. The Hybrid method also significantly reduced both the task completion time and workload compared with the other two proposed methods (User-centered, Robot-centered).
|
|
09:24-09:30, Paper WeAT7.10 | Add to My Program |
Improving Surgical Situational Awareness with Signed Distance Field: A Pilot Study in Virtual Reality |
|
Ishida, Hisashi | Johns Hopkins University |
Barragan, Juan Antonio | Johns Hopkins University |
Munawar, Adnan | Johns Hopkins University |
Li, Zhaoshuo | Johns Hopkins University |
Ding, Andy | The Johns Hopkins University School of Medicine |
Kazanzides, Peter | Johns Hopkins University |
Trakimas, Danielle | Johns Hopkins University |
Creighton, Francis | Johns Hopkins School of Medicine |
Taylor, Russell H. | The Johns Hopkins University |
Keywords: Medical Robots and Systems, Virtual Reality and Interfaces
Abstract: The introduction of image-guided surgical navigation (IGSN) has greatly benefited technically demanding surgical procedures by providing real-time support and guidance to the surgeon during surgery. To develop effective IGSN, a careful selection of the surgical information and the medium to present this information to the surgeon is needed. However, this is not a trivial task due to the broad array of available options. To address this problem, we have developed an open-source library that facilitates the development of multimodal navigation systems in a wide range of surgical procedures relying on medical imaging data. To provide guidance, our system calculates the minimum distance between the surgical instrument and the anatomy and then presents this information to the user through different mechanisms. The real-time performance of our approach is achieved by calculating Signed Distance Fields at initialization from segmented anatomical volumes. Using this framework, we developed a multimodal surgical navigation system to help surgeons navigate anatomical variability in a skull base surgery simulation environment. Three different feedback modalities were explored: visual, auditory, and haptic. To evaluate the proposed system, a pilot user study was conducted in which four clinicians performed mastoidectomy procedures with and without guidance. Each condition was assessed using objective performance and subjective workload metrics. This pilot user study showed improvements in procedural safety without additional time or workload. These results demonstrate our pipeline's successful use case in the context of mastoidectomy.
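The core trick, precompute a signed distance field once and then query tool-anatomy clearance cheaply per frame, can be sketched as follows; the analytic sphere, grid size, and tool-tip position are invented stand-ins, not the library's API:

```python
import numpy as np

# Sketch of the SDF idea: compute a signed distance field once from a
# segmented anatomy (an analytic sphere stands in for a segmented volume),
# then query tool-tip clearance with a cheap lookup per frame.
n, extent = 64, 0.05                                   # grid size, half-width [m]
axis = np.linspace(-extent, extent, n)
X, Y, Z = np.meshgrid(axis, axis, axis, indexing="ij")
radius = 0.02
sdf = np.sqrt(X**2 + Y**2 + Z**2) - radius             # negative inside anatomy

def query(sdf, point):
    """Nearest-voxel SDF lookup (trilinear interpolation omitted for brevity)."""
    idx = np.clip(np.round((np.asarray(point) + extent) / (2 * extent) * (n - 1)),
                  0, n - 1).astype(int)
    return sdf[tuple(idx)]

tip = (0.03, 0.0, 0.0)                                  # hypothetical tool-tip pose
print(f"clearance to anatomy: {query(sdf, tip) * 1000:.1f} mm")
```

Because the expensive distance transform happens at initialization, each per-frame query is a constant-time lookup, which is what makes real-time visual, auditory, or haptic feedback feasible.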
|
|
09:30-09:36, Paper WeAT7.11 | Add to My Program |
Development and Evaluation of a Single-Arm Robotic System for Autonomous Suturing |
|
Liu, Jiawei | Johns Hopkins University |
Kam, Michael | Johns Hopkins University |
Opfermann, Justin | Johns Hopkins University |
Zhang, Zheyuan | Johns Hopkins University |
Hsieh, Michael | Children's National Medical Center |
Kang, Jin | The Johns Hopkins University |
Krieger, Axel | Johns Hopkins University |
Keywords: Medical Robots and Systems, Mechanism Design, Surgical Robotics: Planning
Abstract: This article introduces a novel suture managing device (SMD) and new suture management controller to enable single-arm suture management during autonomous suturing with the Smart Tissue Autonomous Robot (STAR). The primary function of the SMD is to tension and manage the suture thread, a task that was previously carried out by a second manipulator or a human assistant. The SMD and its controller are integrated into STAR's autonomous suturing workflow. Experiments were conducted to quantify the tensioning force of the SMD and to evaluate the suture quality of the new single-arm system. The prototype of the SMD achieves a 1.67 N tensioning force with a suturing time of 29.1 ± 0.42 s per stitch. Our study results demonstrate that the single-arm STAR system with SMD achieves equivalent performance to our previous works in suturing efficiency where suture management was performed with either a dual-armed robotic system or by a human surgical assistant. The study's findings contribute to the field of medical robotics and to our knowledge represent the first known instance of single-arm suturing with suture management during autonomous anastomosis.
|
|
09:36-09:42, Paper WeAT7.12 | Add to My Program |
End-To-End Learning of Deep Visuomotor Policy for Needle Picking |
|
Lin, Hongbin | Chinese University of Hong Kong |
Li, Bin | The Chinese University of Hong Kong |
Chu, Xiangyu | The Chinese University of Hong Kong |
Dou, Qi | The Chinese University of Hong Kong |
Liu, Yunhui | Chinese University of Hong Kong |
Au, K. W. Samuel | The Chinese University of Hong Kong |
Keywords: Medical Robots and Systems, Deep Learning in Grasping and Manipulation, Computer Vision for Medical Robotics
Abstract: Needle picking is a challenging manipulation task in robot-assisted surgery due to needles' small, slender shapes, their variations in shape and size, and the demand for millimeter-level control. Prior works, which rely heavily on needle priors (e.g., geometric models), are hard to scale to unseen needle variations. In this paper, we present the first end-to-end learning method to train a deep visuomotor policy for needle picking. Concretely, we propose DreamerfD to maximally leverage demonstrations to improve the learning efficiency of a state-of-the-art model-based reinforcement learning method, DreamerV2. Since the Variational Auto-Encoder (VAE) in DreamerV2 is difficult to scale to high-resolution images, we propose Dynamic Spotlight Adaptation to represent control-related visual signals in a low-resolution image space; Virtual Clutch is also proposed to reduce performance degradation due to the significant error between prior and posterior encoded states at the beginning of a rollout. We conducted extensive experiments in simulation to evaluate the performance, robustness, in-domain variation adaptation, and effectiveness of individual components of our method. Our method, trained on 8k demonstration timesteps and 140k online policy timesteps, can achieve a remarkable success rate of 80%. Furthermore, our method effectively demonstrated its superiority in generalization to unseen in-domain variations, including needle variations and image disturbance, highlighting its robustness and versatility. Codes and videos are available at https://sites.google.com/view/DreamerfD.
|
|
09:42-09:48, Paper WeAT7.13 | Add to My Program |
Value-Informed Skill Chaining for Policy Learning of Long-Horizon Tasks with Surgical Robot |
|
Huang, Tao | The Chinese University of Hong Kong |
Chen, Kai | The Chinese University of Hong Kong |
Wei, Wang | The Chinese University of Hong Kong |
Li, Jianan | The Chinese University of Hong Kong |
Long, Yonghao | The Chinese University of Hong Kong |
Dou, Qi | The Chinese University of Hong Kong |
Keywords: Surgical Robotics: Laparoscopy, Surgical Robotics: Planning
Abstract: Reinforcement learning still struggles to solve long-horizon surgical robot tasks, which involve multiple steps over an extended duration, due to the challenge of policy exploration. Recent methods try to tackle this problem with skill chaining, in which the long-horizon task is decomposed into multiple subtasks to ease the exploration burden, and the subtask policies are temporally connected to complete the whole task. However, smoothly connecting all subtask policies is difficult in surgical robot scenarios. Not all states are equally suitable for connecting two adjacent subtasks: an undesired terminal state of the previous subtask can make the current subtask policy unstable and result in a failed execution. In this work, we introduce value-informed skill chaining (ViSkill), a novel reinforcement learning framework for long-horizon surgical robot tasks. The core idea is to distinguish which terminal states are suitable for starting all the following subtask policies. To achieve this, we introduce a state value function that estimates the expected success probability of the entire task given a state. Based on this value function, a chaining policy is learned to instruct subtask policies to terminate at the state with the highest value, so that all subsequent policies are more likely to be connected for accomplishing the task. We demonstrate the effectiveness of our method on three complex surgical robot tasks from SurRoL, a comprehensive surgical simulation platform, achieving high task success rates and execution efficiency. Code is available at https://github.com/med-air/ViSkill.
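The chaining idea, terminate each subtask where a learned value function predicts the rest of the task is most likely to succeed, reduces to an argmax over candidate hand-off states; the value function below is an invented proxy, not the paper's learned network:

```python
# Sketch of value-informed hand-off selection (not the ViSkill implementation):
# among candidate terminal states of a subtask, prefer the one a state-value
# function scores as most likely to let the remaining subtasks succeed.

def value_fn(state):
    """Assumed proxy: success probability decays with distance from a
    nominal hand-off pose at the origin. A real system would learn this."""
    return 1.0 / (1.0 + sum(s * s for s in state))

def pick_handoff(candidate_states):
    """Terminate the current subtask at the highest-value candidate state."""
    return max(candidate_states, key=value_fn)

candidates = [(0.9, 0.1), (0.2, 0.1), (0.5, -0.6)]
print("chosen hand-off state:", pick_handoff(candidates))
```

In the paper this selection is itself a learned chaining policy that steers the subtask policy toward high-value terminal states, rather than a post-hoc argmax over samples.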
|
|
WeAT8 Regular session, 141 |
Add to My Program |
Humanoid and Bipedal Locomotion |
|
|
Chair: Hubicki, Christian | Florida State University |
Co-Chair: Hereid, Ayonga | Ohio State University |
|
08:30-08:36, Paper WeAT8.1 | Add to My Program |
Design of a Jumping Control Framework with Heuristic Landing for Bipedal Robots |
|
Zhang, Jingwen | University of California, Los Angeles |
Shen, Junjie | UCLA |
Liu, Yeting | UCLA |
Hong, Dennis | UCLA |
Keywords: Humanoid and Bipedal Locomotion, Whole-Body Motion Planning and Control, Optimization and Optimal Control
Abstract: Generating dynamic jumping motions on legged robots remains a challenging control problem as the full flight phase and large landing impact are expected. Compared to quadrupedal robots or other multi-legged robots, bipedal robots place higher requirements on the control strategy given a much smaller support polygon. To solve this problem, a novel heuristic landing planner is proposed in this paper. With momentum feedback during the flight phase, landing locations can be updated to minimize the influence of uncertainties from tracking errors or external disturbances when landing. To the best of our knowledge, this is the first approach that exploits the flight phase to reduce the landing impact of a jump and that has been implemented on an actual robot. By integrating it with a modified kino-dynamics motion planner with centroidal momentum and a low-level controller that exploits the whole-body dynamics to hierarchically handle multiple tasks, a complete and versatile jumping control framework is designed in this paper. Extensive simulation and hardware jumping experiments on a miniature bipedal robot with proprioceptive actuation demonstrate that the proposed framework achieves human-like, efficient, and robust jumping tasks, including directional jumps, twisting jumps, step jumps, and somersaults.
|
|
08:36-08:42, Paper WeAT8.2 | Add to My Program |
Proprioceptive External Torque Learning for Floating Base Robot and Its Applications to Humanoid Locomotion |
|
Lim, Daegyu | Seoul National University |
Kim, Myeong-Ju | Seoul National University |
Cha, Junhyeok Ruiyi | Seoul National University |
Kim, Donghyeon | Graduate School of Convergence Science and Technology, Seoul Nat |
Park, Jaeheung | Seoul National University |
Keywords: Humanoid and Bipedal Locomotion, Force and Tactile Sensing, Deep Learning Methods
Abstract: The estimation of external joint torque and contact wrench is essential for achieving stable locomotion of humanoids and safety-oriented robots. Although the contact wrench on the foot of humanoids can be measured using a force-torque sensor (FTS), FTS increases the cost, inertia, complexity, and failure possibility of the system. This paper introduces a method for learning external joint torque solely using proprioceptive sensors (encoders and IMUs) for a floating base robot. For learning, the GRU network is used and random walking data is collected. Real robot experiments demonstrate that the network can estimate the external torque and contact wrench with significantly smaller errors compared to the model-based method, momentum observer (MOB) with friction modeling. The study also validates that the estimated contact wrench can be utilized for zero moment point (ZMP) feedback control, enabling stable walking. Moreover, even when the robot's feet and the inertia of the upper body are changed, the trained network shows consistent performance with a model-based calibration. This result demonstrates the possibility of removing FTS on the robot, which reduces the disadvantages of hardware sensors.
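The estimator is built on a GRU recurrence over proprioceptive histories; a minimal NumPy forward pass of one GRU cell sketches that recurrence (sizes, weights, and inputs are random placeholders, not the trained network, which maps encoder/IMU histories to external joint torque):

```python
import numpy as np

def gru_cell(x, h, Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh):
    """One step of a standard GRU cell (NumPy, for illustration only)."""
    sig = lambda a: 1.0 / (1.0 + np.exp(-a))
    z = sig(Wz @ x + Uz @ h + bz)             # update gate
    r = sig(Wr @ x + Ur @ h + br)             # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h) + bh)
    return (1 - z) * h + z * h_tilde          # blend old and candidate state

rng = np.random.default_rng(0)
n_in, n_h = 6, 8                              # e.g. joint pos/vel + IMU channels (assumed)
params = [rng.normal(scale=0.1, size=s)
          for s in [(n_h, n_in), (n_h, n_h), (n_h,)] * 3]  # Wz,Uz,bz, Wr,Ur,br, Wh,Uh,bh
h = np.zeros(n_h)
for t in range(10):                           # roll the cell over a short input window
    h = gru_cell(rng.normal(size=n_in), h, *params)
print("hidden state norm:", float(np.linalg.norm(h)))
```

A final linear readout (omitted here) would map the hidden state to the estimated external torque; the paper trains the whole pipeline on random-walking data and compares it against a friction-modeled momentum observer.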
|
|
08:42-08:48, Paper WeAT8.3 | Add to My Program |
Time to Danger, an Alternative to Passive Safety for the Locomotion of a Biped Robot in a Crowd |
|
Ciocca, Matteo | INRIA |
Wieber, Pierre-Brice | INRIA |
Fraichard, Thierry | INRIA |
Keywords: Humanoid and Bipedal Locomotion, Legged Robots, Collision Avoidance
Abstract: A biped robot walking in a crowd must avoid falls and collisions at the same time. The latter is usually addressed through Passive Safety (PS), which guarantees that the robot is at rest when a collision is inevitable. Since PS may limit the robot's mobility, the purpose of this work is to introduce and explore the novel concept of Time To Danger (TTD) as an alternative. For a given robot motion, TTD is the time where the robot enters the region that a person can potentially occupy in the future. After having studied the properties of TTD, a novel locomotion strategy is proposed: it follows a receding horizon Model Predictive Control scheme, and it computes an optimal locomotion plan that guarantees balance preservation and TTD maximization. Controlled experiments in a challenging simulated crowd scenario demonstrate how the novel locomotion strategy outperforms a Passive Safety-based locomotion strategy from a collision avoidance point of view.
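The TTD definition, the first time the planned robot motion enters the region a person could have reached, admits a direct sampled-trajectory computation; the trajectories, speeds, and sampling below are illustrative assumptions, not the paper's MPC formulation:

```python
import math

# Sketch of Time To Danger: along a sampled robot trajectory, TTD is the
# first time the robot lies inside the disc a person could have reached by
# then, assuming the person moves at most at v_max in any direction.

def time_to_danger(robot_traj, person_pos, v_max, dt):
    """robot_traj: list of (x, y) positions sampled every dt seconds."""
    for k, p in enumerate(robot_traj):
        t = k * dt
        reach = v_max * t                      # radius the person may cover by t
        if math.dist(p, person_pos) <= reach:
            return t
    return math.inf                            # plan never enters the danger region

# Illustrative case: robot walks straight toward a person 4 m away at 1 m/s,
# person maximum speed 1.5 m/s.
dt = 0.1
traj = [(1.0 * k * dt, 0.0) for k in range(100)]
print("TTD:", time_to_danger(traj, person_pos=(4.0, 0.0), v_max=1.5, dt=dt))
```

The locomotion strategy in the paper maximizes this quantity inside a receding-horizon MPC rather than merely evaluating it on a fixed plan.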
|
|
08:48-08:54, Paper WeAT8.4 | Add to My Program |
ZMP Feedback Balance Control of Humanoid in Response to Ground Acceleration |
|
Konishi, Masanori | The University of Tokyo |
Kojima, Kunio | The University of Tokyo |
Okada, Kei | The University of Tokyo |
Inaba, Masayuki | The University of Tokyo |
Kawasaki, Koji | The University of Tokyo |
Keywords: Humanoid and Bipedal Locomotion, Body Balancing
Abstract: For a humanoid robot to balance on movable ground, balance feedback control that responds to the ground's unpredictable movement is required. However, such feedback control raises two issues: (A) interaction between the ground dynamics and the balance control may cause vibration, and (B) the balance control may instead degrade stability due to response delay. To solve these problems, this study adds a support-foot acceleration term to the walking stabilizer and chooses its gain by considering two conditions: (A) avoiding steady-state vibration in a two-mass linear inverted pendulum model on arbitrary ground, and (B) reducing the influence of inertial forces resulting from the delay of ZMP feedback. Experiments with the life-size humanoid JAXON verified the steady-state vibration phenomenon and showed improved stability during acceleration and deceleration while riding the two-wheeled scooter.
|
|
08:54-09:00, Paper WeAT8.5 | Add to My Program |
Manipulation of Center of Pressure for Bipedal Locomotion by Passive Twisting of Viscoelastic Trunk Joint and Asymmetrical Arm Swinging |
|
Takuma, Takashi | Osaka Institute of Technology |
Hashimoto, Ibuki | Osaka Institute of Technology |
Andachi, Ryo | Osaka Institute of Technology |
Sugimoto, Yasuhiro | Osaka Univ |
Aoi, Shinya | Osaka University |
Keywords: Humanoid and Bipedal Locomotion, Biologically-Inspired Robots, Whole-Body Motion Planning and Control
Abstract: For successful bipedal locomotion, a robot's center of pressure (CoP) must be kept within the support area. Many studies have focused on the lower body to achieve locomotion. Given that the human upper body has a large mass and its behavior influences locomotion even in robots, this study investigates the effect of the upper body, comprising swinging arms and a twisting trunk, on the CoP. The dynamics are analyzed using a simple model with a passive viscoelastic trunk joint about the vertical axis and arms that oscillate back and forth. From the derived CoP and trunk-joint trajectories, three important findings emerge: (i) the CoP oscillates along the lateral direction only when the arms swing in an anterior-posterior asymmetric manner; (ii) the trajectories of the CoP and trunk joint are in anti-phase; and (iii) the phase of the CoP along the lateral direction is influenced by the swinging cycle and the viscoelasticity of the trunk joint, i.e., the mechanical elements of the upper body. A robot experiment verified the first and second findings, and simulation verified the third. The third finding will contribute to building a feedback controller that converges to the desired phase, an important factor for successful bipedal locomotion.
|
|
09:00-09:06, Paper WeAT8.6 | Add to My Program |
An Implantable Variable Length Actuator for Modulating in Vivo Musculo-Tendon Force in a Bipedal Animal Model |
|
Thomas, Sean | Penn State University (PSU) |
Joshi, Ravin | Penn State University (PSU) |
Cheng, Bo | Pennsylvania State University |
Cheng, Huanyu | Penn State University (PSU) |
Aynardi, Michael C. | Penn State University (PSU) |
Sawicki, Gregory | Georgia Institute of Technology |
Rubenson, Jonas | Pennsylvania State University, Biomechanics Laboratory / Muscle |
Keywords: Humanoid and Bipedal Locomotion, Rehabilitation Robotics, Wearable Robotics
Abstract: Mobility, a critical factor in quality of life, is often rehabilitated using simplistic solutions, such as walkers. Exoskeletons (wearable robotics) offer a more sophisticated rehabilitation approach. However, non-adherence to externally worn mobility aids limits their efficacy. Here, we present the concept of a fully implantable assistive limb actuator that overcomes non-adherence constraints, and which can provide high-precision assistive force. In a bipedal animal model (fowl), we have developed a variable length isometric actuator (measuring 9 × 30 mm) that is able to be directly implanted within the leg via a bone anchor and tendon fixation, replacing the lateral gastrocnemius muscle belly. The actuator is able to generate isometric force similar to the in vivo force of the native muscle, designed to generate assistive torque at the ankle and reduce muscular demand at no additional energy cost. The device has a stroke of 10 mm that operates up to 770 mm/s (77 stroke lengths/s), capable of acting as a clutch (disengaging when needed) and with a tunable slack length to modulate the timing and level of assistive force during gait. Surgical techniques to attach the actuator to the biological system, the Achilles tendon and tibia, have been established and validated using survival surgeries and cadaveric specimens.
|
|
09:06-09:12, Paper WeAT8.7 | Add to My Program |
Whole-Body Torque Control without Joint Position Control Using Vibration-Suppressed Friction Compensation for Bipedal Locomotion of Gear-Driven Torque Sensorless Humanoid |
|
Hiraoka, Takuma | The University of Tokyo |
Sato, Shimpei | The University of Tokyo |
Hiraoka, Naoki | The University of Tokyo |
Tang, Annan | The University of Tokyo |
Kojima, Kunio | The University of Tokyo |
Okada, Kei | The University of Tokyo |
Inaba, Masayuki | The University of Tokyo |
Kawasaki, Koji | The University of Tokyo |
Keywords: Humanoid and Bipedal Locomotion, Force Control, Body Balancing
Abstract: Humanoids operate in repeated contact and non-contact with their environment, so motions such as walking on uneven terrain or in narrow spaces require accurate force and position control. Joint torque control systems are suitable for position and force control but are prone to friction and other modeling errors. Methods have been proposed to realize torque control in combination with joint position control systems or by improving joint structures such as sensors and actuators, but these suffer from response delay and increased weight and volume. It is thus difficult to achieve motion of life-size humanoids by whole-body torque control. In this paper, we address these challenges not within one specific layer but with multiple layers that complement each other. We propose a hierarchical whole-body torque control method using four layers: friction compensation based on a vibration-suppressed model, prioritized whole-body resolved acceleration control, center-of-gravity acceleration control based on foot-guided control, and landing position and timing correction based on the capture point. We verify through walking experiments that the proposed method controls a life-size humanoid driven by high-reduction-ratio joints via whole-body torque control without torque sensors or joint position control, and that it enables the robot to move, and even transport an object, on outdoor uneven terrain.
|
|
09:12-09:18, Paper WeAT8.8 | Add to My Program |
An Approach for Generating Families of Energetically Optimal Gaits from Passive Dynamic Walking Gaits |
|
Rosa, Nelson | University of Stuttgart |
Katamish, Bassel | Technical University of Berlin |
Raff, Maximilian | University of Stuttgart |
Remy, C. David | University of Stuttgart |
Keywords: Passive Walking, Humanoid and Bipedal Locomotion, Optimization and Optimal Control
Abstract: For a class of biped robots with impulsive dynamics and a non-empty set of passive gaits (unactuated, periodic motions of the biped model), we present a method for computing continuous families of locally optimal gaits with respect to a class of commonly used energetic cost functions (e.g., the integral of torque squared). We compute these families using only the passive gaits of the biped, which are globally optimal with respect to these cost functions. Our approach fills an important gap in the literature: existing methods for computing libraries of locally optimal gaits often do not use these globally optimal solutions as seed values. We demonstrate our approach on a well-studied two-link biped model.
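The seeding idea, warm-starting each optimization from the previous solution along the family, can be illustrated with a toy natural-parameter continuation on a scalar problem; the function names and the toy cost are assumptions for illustration only:

```python
import numpy as np

def continue_minima(grad, x0, lams, newton_iters=20, eps=1e-6):
    """Natural-parameter continuation: starting from a known critical point
    x0 of grad(x, 0) (the 'passive' solution), track the family of critical
    points as lam increases, warm-starting Newton at the previous solution."""
    branch, x = [], np.atleast_1d(np.asarray(x0, float))
    for lam in lams:
        for _ in range(newton_iters):
            g = np.atleast_1d(grad(x, lam))
            # finite-difference Jacobian of grad with respect to x
            J = np.column_stack([
                (np.atleast_1d(grad(x + eps * e, lam)) - g) / eps
                for e in np.eye(len(x))])
            x = x - np.linalg.solve(J, g)
        branch.append(x.copy())
    return branch
```

For gait libraries, `grad` would be the gradient of the energetic cost over one step subject to the periodicity constraints; here a scalar toy gradient suffices to show the mechanics.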
|
|
09:18-09:24, Paper WeAT8.9 | Add to My Program |
Stair Climbing Using the Angular Momentum Linear Inverted Pendulum Model and Model Predictive Control |
|
Dosunmu-Ogunbi, Oluwami | University of Michigan |
Shrivastava, Aayushi | University of Michigan Ann Arbor |
Gibson, Grant | University of Michigan |
Grizzle, J.W | University of Michigan |
Keywords: Humanoid and Bipedal Locomotion
Abstract: A new control paradigm using angular momentum and foot placement as state variables in the linear inverted pendulum model has expanded the realm of possibilities for the control of bipedal robots. This paradigm, known as the ALIP model, has proven effective when the robot's center of mass height can be assumed constant or near constant and when there are no non-kinematic restrictions on foot placement. Walking up and down stairs violates both assumptions: the center of mass height varies significantly within a step, and the geometry of the stairs restricts the effectiveness of foot placement. In this paper, we explore a variation of the ALIP model that allows the length of the virtual pendulum formed by the robot's stance foot and center of mass to follow smooth trajectories during a step. We couple this model with a control strategy built from a novel combination of virtual constraint-based control and a model predictive control algorithm to stabilize a stair-climbing gait that does not rely solely on foot placement. Simulations on a 20-degree-of-freedom model of the Cassie biped in the SimMechanics simulation environment show that the controller achieves a periodic gait.
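For intuition, the standard constant-height ALIP step-to-step map (the paradigm this paper generalizes) admits a closed-form solution and a one-step deadbeat foot-placement law; the mass, height, and step-time values below are illustrative assumptions:

```python
import numpy as np

def alip_flow(x, L, T=0.3, m=30.0, H=0.9, g=9.81):
    """Closed-form one-step flow of the constant-height ALIP model: x is the
    CoM position relative to the stance foot, L the angular momentum about
    the contact point."""
    w = np.sqrt(g / H)
    c, s = np.cosh(w * T), np.sinh(w * T)
    return c * x + s * L / (m * H * w), m * H * w * s * x + c * L

def deadbeat_foot(x, L, L_des, T=0.3, m=30.0, H=0.9, g=9.81):
    """Foot placement u such that, one step after stepping, the angular
    momentum equals L_des (one-step deadbeat foot-placement law)."""
    w = np.sqrt(g / H)
    c, s = np.cosh(w * T), np.sinh(w * T)
    return x - (L_des - c * L) / (m * H * w * s)
```

Stairs break the closed form because H varies within the step, which is exactly why the paper turns to a time-varying pendulum length and MPC.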
|
|
09:24-09:30, Paper WeAT8.10 | Add to My Program |
Real-Time Dynamic Bipedal Avoidance |
|
Wang, Tianze | Florida State University |
White, Jason | Florida State University |
Hubicki, Christian | Florida State University |
Keywords: Humanoid and Bipedal Locomotion, Motion and Path Planning, Collision Avoidance
Abstract: In real-world settings, bipedal robots must avoid collisions with people and their environment. Further, a biped can choose between modes of avoidance: (1) adjust its pose while standing or (2) step to gain maneuverability. We present a real-time motion planner and multibody control framework for dynamic bipedal robots that avoids multiple moving obstacles and automatically switches between standing and stepping modes as necessary. By leveraging a reduced-order model (the Linear Inverted Pendulum Model) and a half-space representation of the safe region, the planner is formulated as a convex optimization problem (a Quadratic Program) suitable for real-time application with Model Predictive Control (MPC). To facilitate mode switching, we introduce center-of-pressure-related slack variables into the convex planning optimization that both shape the planning cost function and provide a mode-switching criterion for dynamic locomotion. Finally, we implement the proposed algorithm on a 3D Cassie bipedal robot and present hardware experiments showing real-time bipedal standing avoidance, stepping avoidance, and automatic switching between avoidance modes.
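The half-space idea can be sketched as follows: each moving obstacle contributes one linear constraint, and the desired step target is clipped into the resulting safe polytope. A real implementation would solve the full QP; this sketch uses a simple per-constraint projection, and all names and the margin value are hypothetical:

```python
import numpy as np

def safe_halfspaces(robot_pos, obstacles, margin=0.3):
    """Build one linear constraint a.x <= b per obstacle, keeping x on the
    robot's side of a separating plane placed `margin` before the obstacle."""
    p = np.asarray(robot_pos, float)
    A, b = [], []
    for o in obstacles:
        o = np.asarray(o, float)
        a = (o - p) / np.linalg.norm(o - p)  # unit normal pointing at obstacle
        A.append(a)
        b.append(a @ o - margin)
    return np.array(A), np.array(b)

def project_to_safe(target, A, b):
    """Clip a desired step target into the safe polytope with one projection
    pass per constraint (a QP solver would handle the coupling exactly)."""
    x = np.asarray(target, float)
    for a, bi in zip(A, b):
        if a @ x > bi:
            x = x - (a @ x - bi) * a
    return x
```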
|
|
09:30-09:36, Paper WeAT8.11 | Add to My Program |
Data-Driven Adaptation for Robust Bipedal Locomotion with Step-To-Step Dynamics |
|
Dai, Min | California Institute of Technology |
Xiong, Xiaobin | University of Wisconsin Madison |
Lee, Jaemin | California Institute of Technology |
Ames, Aaron | Caltech |
Keywords: Humanoid and Bipedal Locomotion, Whole-Body Motion Planning and Control, Robust/Adaptive Control
Abstract: This paper presents an online framework for synthesizing agile locomotion for bipedal robots that adapts to unknown environments, modeling errors, and external disturbances. To this end, we leverage step-to-step (S2S) dynamics, which has proven effective in realizing dynamic walking on underactuated robots, assuming known dynamics and environments. This paper considers the case of uncertain models and environments and presents a data-driven representation of the S2S dynamics that can be learned via an adaptive control approach that is both data-efficient and easy to implement. The learned S2S controller generates desired discrete foot placements, which are then realized on the full-order dynamics of the bipedal robot by tracking desired outputs synthesized from the given foot placement. The benefits of the proposed approach are twofold. First, it improves the robot's ability to walk at a given desired velocity compared to the non-adaptive baseline controller. Second, the data-driven approach enables stable and agile locomotion under various unknown disturbances: additional unmodeled payload, large robot model errors, external disturbance forces, biased velocity estimation, and sloped terrains. This is demonstrated through in-depth evaluation with a high-fidelity simulation of the bipedal robot Cassie subject to the aforementioned disturbances.
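One simple way to realize a data-efficient, easy-to-implement adaptive fit of a linear step-to-step model is recursive least squares; this generic sketch (not necessarily the paper's exact update) adapts parameters theta online from one regressor/observation pair per step:

```python
import numpy as np

def rls_update(theta, P, phi, y, lam=0.98):
    """One recursive-least-squares step fitting y ~ theta @ phi with
    forgetting factor lam; usable to adapt a linear step-to-step residual
    model online from one (regressor, observation) pair per walking step."""
    phi = np.asarray(phi, float).reshape(-1, 1)
    K = P @ phi / (lam + phi.T @ P @ phi)        # gain
    err = np.asarray(y, float) - (theta @ phi).ravel()
    theta = theta + np.outer(err, K.ravel())     # parameter update
    P = (P - K @ phi.T @ P) / lam                # covariance update
    return theta, P
```

Here `phi` would stack the pre-impact state and foot placement, and `y` the observed next pre-impact state; the forgetting factor lets the model track slowly varying disturbances.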
|
|
09:36-09:42, Paper WeAT8.12 | Add to My Program |
Template Model Inspired Task Space Learning for Robust Bipedal Locomotion |
|
Castillo, Guillermo A. | The Ohio State University |
Weng, Bowen | The Ohio State University |
Yang, Shunpeng | Southern University of Science and Technology |
Zhang, Wei | Southern University of Science and Technology |
Hereid, Ayonga | Ohio State University |
Keywords: Humanoid and Bipedal Locomotion, Reinforcement Learning, Machine Learning for Robot Control
Abstract: This work presents a hierarchical framework for bipedal locomotion that combines a Reinforcement Learning (RL)-based high-level (HL) planner policy for the online generation of task space commands with a model-based low-level (LL) controller to track the desired task space trajectories. Different from traditional end-to-end learning approaches, our HL policy takes insights from the angular momentum-based linear inverted pendulum (ALIP) model to carefully design the observation and action spaces of the Markov Decision Process (MDP). This simple yet effective design creates an insightful mapping between a low-dimensional state that effectively captures the complex dynamics of bipedal locomotion and a set of task space outputs that shape the walking gait of the robot. The HL policy is agnostic to the task space LL controller, which increases the flexibility of the design and the generalization of the framework to other bipedal robots. This hierarchical design results in a learning-based framework with improved performance, data efficiency, and robustness compared with the ALIP model-based approach and state-of-the-art learning-based frameworks for bipedal locomotion. The proposed hierarchical controller is tested on three different robots: Rabbit, a five-link underactuated planar biped; Walker2D, a seven-link fully actuated planar biped; and Digit, a 3D humanoid robot with 20 actuated joints. The trained policy naturally learns human-like locomotion behaviors and effectively tracks a wide range of walking speeds while preserving the robustness and stability of the walking gait even under adversarial conditions.
|
|
09:42-09:48, Paper WeAT8.13 | Add to My Program |
Overtaking Moving Obstacles with Digit: Path Following for Bipedal Robots Via Model Predictive Contouring Control |
|
Narkhede, Kunal Sanjay | University of Delaware |
Thanki, Dhruv Ashwinkumar | University of Delaware |
Kulkarni, Abhijeet Mangesh | University of Delaware |
Poulakakis, Ioannis | University of Delaware |
Keywords: Humanoid and Bipedal Locomotion, Legged Robots
Abstract: Humanoid robots are expected to navigate in changing environments and perform a variety of tasks. Frequently, these tasks require the robot to make decisions online regarding the speed and precision of following a reference path. For example, a robot may decide to temporarily deviate from its path to overtake a slowly moving obstacle that shares the same path and is ahead. In this case, path following performance is compromised in favor of fast path traversal. Available global trajectory-tracking approaches typically assume a time parametrization of the path that is specified in advance and seek to minimize the norm of the Cartesian error. As a result, where the robot should be on the path at each time is fixed, and temporary deviations from the path are strongly discouraged. Given a global path, this paper presents a Model Predictive Contouring Control (MPCC) approach to selecting footsteps that maximize path traversal while simultaneously allowing the robot to decide between faithful versus fast path following. The method is evaluated in high-fidelity simulations of the bipedal robot Digit in terms of tracking performance of curved paths under disturbances and is also applied to the case where Digit overtakes a moving obstacle.
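At the heart of MPCC is the decomposition of the deviation from the path into a lag error (along the path tangent) and a contouring error (orthogonal to it), which the cost weights separately to trade traversal speed against path fidelity. A minimal sketch, assuming a 2-D path and hypothetical names:

```python
import numpy as np

def contouring_errors(pos, path_pt, tangent):
    """Split the deviation from a reference path point into a lag error
    (along the unit tangent) and a contouring error (orthogonal to it)."""
    e = np.asarray(pos, float) - np.asarray(path_pt, float)
    t = np.asarray(tangent, float)
    t = t / np.linalg.norm(t)
    e_lag = float(e @ t)                          # behind/ahead along the path
    e_contour = float(np.linalg.norm(e - e_lag * t))  # lateral deviation
    return e_lag, e_contour
```

Penalizing the contouring error heavily while rewarding path progress yields faithful following; relaxing it lets the robot cut around an obstacle, which is the trade-off the paper exposes.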
|
|
WeAT9 Regular session, 142ABC |
Add to My Program |
Formal Methods and Planning |
|
|
Chair: Figat, Maksym | Warsaw University of Technology |
Co-Chair: Simaan, Nabil | Vanderbilt University |
|
08:30-08:36, Paper WeAT9.1 | Add to My Program |
Synthesis of Robotic System Controllers Using Robotic System Specification Language |
|
Figat, Maksym | Warsaw University of Technology |
Zieliński, Cezary | Institute of Control and Computation Engineering, Warsaw Univers |
Keywords: Control Architectures and Programming, Methods and Tools for Robot System Design, Petri Nets for Automation Control
Abstract: Robotic System Specification Language (RSSL) stems from the embodied agent approach to robotic system design. It enables the specification of both the structure and activities of a multi-robot multi-agent robotic system. An RSSL specification can be verified and automatically transformed by its compiler into a six-layered Robotic System Hierarchical Petri Net (RSHPN). The RSHPN models the activities and structure of the designed robotic system. The automatically generated RSHPN is loaded into the RSHPN Tool, which models RSHPNs and automatically generates the controller code. This approach was validated on several robotic systems. The use of RSSL and RSHPN facilitates the synthesis of robotic system controllers.
|
|
08:36-08:42, Paper WeAT9.2 | Add to My Program |
Overcoming Exploration: Deep Reinforcement Learning for Continuous Control in Cluttered Environments from Temporal Logic Specifications |
|
Cai, Mingyu | Lehigh University
Aasi, Erfan | Boston University |
Belta, Calin | Boston University |
Vasile, Cristian Ioan | Lehigh University |
Keywords: Formal Methods in Robotics and Automation, Reinforcement Learning, Machine Learning for Robot Control
Abstract: Model-free continuous control for robot navigation tasks using Deep Reinforcement Learning (DRL), which relies on noisy policies to explore and collect data for optimization, is sensitive to the density of rewards. In practice, robots are usually deployed in cluttered environments containing dense obstacles and narrow passageways, where designing dense, effective rewards is challenging, resulting in exploration issues during training. The problem becomes even more serious when the pre-defined tasks have a complex temporal and logical structure. This work presents a deep policy gradient algorithm for a task-guided robot with unknown dynamics and focuses on two aspects, i.e., cluttered environments and rich high-level tasks. To overcome the exploration challenge during training, we propose a novel path planning-guided reward scheme that integrates sampling-based methods to effectively complete goal-reaching missions. Linear Temporal Logic (LTL) is used to express rich robotic specifications. To facilitate LTL satisfaction, our approach decomposes the LTL mission into sub-goal-reaching tasks that are solved in a distributed manner. Our framework is shown to significantly improve the performance (effectiveness, efficiency) and exploration of robots tasked with complex missions in large-scale cluttered environments. A video demonstration can be found on YouTube: https://youtu.be/YQRQ2-yMtIk.
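A path planning-guided reward of the kind described can be sketched as the negative distance-to-go along a planner's waypoint path, with a bonus at the goal; the waypoint handling, scale, and bonus value below are assumptions for illustration:

```python
import numpy as np

def path_guided_reward(pos, waypoints, reach_bonus=10.0, scale=1.0):
    """Dense reward from a sampling-based planner's path: the negative of the
    remaining distance-to-go measured through the waypoint sequence, with a
    bonus once the goal (last waypoint) is reached."""
    pos = np.asarray(pos, float)
    wps = [np.asarray(w, float) for w in waypoints]
    # distance to each waypoint plus the path length remaining after it
    dist_to_go = min(
        np.linalg.norm(pos - wps[i])
        + sum(np.linalg.norm(wps[j + 1] - wps[j]) for j in range(i, len(wps) - 1))
        for i in range(len(wps)))
    return reach_bonus if dist_to_go < 1e-3 else -scale * dist_to_go
```

Unlike a sparse goal-only reward, this signal decreases monotonically along the planned route, so exploration noise is always informative even deep inside cluttered regions.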
|
|
08:42-08:48, Paper WeAT9.3 | Add to My Program |
STL: Surprisingly Tricky Logic (for System Validation) |
|
Siu, Ho Chit | Massachusetts Institute of Technology |
Leahy, Kevin | MIT Lincoln Laboratory |
Mann, Makai | MIT Lincoln Laboratory |
Keywords: Acceptability and Trust, Formal Methods in Robotics and Automation, Design and Human Factors
Abstract: Much of the recent work developing formal methods techniques to specify or learn the behavior of autonomous systems is predicated on a belief that formal specifications are interpretable and useful for humans when checking systems. Though frequently asserted, this assumption is rarely tested. We performed a human experiment (N = 62) with a mix of people who were and were not familiar with formal methods beforehand, asking them to validate whether a set of signal temporal logic (STL) constraints would keep an agent out of harm and allow it to complete a task in a gridworld capture-the-flag setting. Validation accuracy was 45 +/- 20% (mean +/- standard deviation). The ground-truth validity of a specification, subjects' familiarity with formal methods, and subjects' level of education were found to be significant factors in determining validation correctness. Participants exhibited an affirmation bias, causing significantly increased accuracy on valid specifications but significantly decreased accuracy on invalid specifications. Additionally, participants, particularly those familiar with formal methods, tended to be overconfident in their answers and to be similarly confident regardless of actual correctness. Our data do not support the belief that formal specifications are inherently human-interpretable to a meaningful degree for system validation. We recommend ergonomic improvements to data presentation and validation training, which should be tested before claims of interpretability make their way back into the formal methods literature.
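For readers unfamiliar with STL, the kind of quantity subjects were asked to reason about has a simple quantitative form; this sketch evaluates the robustness of a conjunction of an "always avoid harm" and an "eventually reach the goal" clause over a finite 2-D trace (the radii and names are hypothetical, not the study's spec):

```python
import math

def stl_robustness(trace, hazard, goal, r_hazard=1.0, r_goal=0.5):
    """Quantitative semantics of  G(dist(hazard) > r_hazard) AND
    F(dist(goal) < r_goal)  over a finite trace of 2-D positions; positive
    robustness means the specification is satisfied with that margin."""
    g_harm = min(math.dist(p, hazard) - r_hazard for p in trace)  # 'always'
    f_goal = max(r_goal - math.dist(p, goal) for p in trace)      # 'eventually'
    return min(g_harm, f_goal)  # conjunction takes the minimum
```

Validating a specification, the task participants struggled with, amounts to deciding whether this value is positive for every behavior the spec should allow and negative for every behavior it should forbid.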
|
|
08:48-08:54, Paper WeAT9.4 | Add to My Program |
Real-Time RRT* with Signal Temporal Logic Preferences |
|
Linard, Alexis | KTH Royal Institute of Technology |
Torre, Ilaria | Chalmers University of Technology |
Bartoli, Ermanno | KTH Royal Institute of Technology |
Sleat, Alexander | KTH Royal Institute of Technology
Leite, Iolanda | KTH Royal Institute of Technology |
Tumova, Jana | KTH Royal Institute of Technology |
Keywords: Formal Methods in Robotics and Automation, Motion and Path Planning
Abstract: Signal Temporal Logic (STL) is a rigorous specification language that allows one to express various spatio-temporal requirements and preferences. Its quantitative semantics (called robustness) quantifies to what extent the STL specifications are met. In this work, we focus on enabling STL constraints and preferences in the Real-Time Rapidly Exploring Random Tree (RT-RRT*) motion planning algorithm in an environment with dynamic obstacles. We propose a cost function that guides the algorithm towards the asymptotically most robust solution, i.e., a plan that maximally adheres to the STL specification. In experiments, we applied our method to a social navigation case, where the STL specification captures spatio-temporal preferences on how a mobile robot should avoid an incoming human in a shared space. Our results show that our approach produces plans adhering to the STL specification while keeping cost computation efficient.
|
|
08:54-09:00, Paper WeAT9.5 | Add to My Program |
Sensor Selection for Fine-Grained Behavior Verification That Respects Privacy |
|
Phatak, Rishi | Texas A&M University |
Shell, Dylan | Texas A&M University |
Keywords: Formal Methods in Robotics and Automation, Discrete Event Dynamic Automation Systems, Reactive and Sensor-Based Planning
Abstract: A useful capability is that of classifying some agent’s behavior using data from a sequence, or trace, of sensor measurements. The sensor selection problem involves choosing a subset of available sensors to ensure that, when generated, observation traces will contain enough information to determine whether the agent’s activities match some pattern. In generalizing prior work, this paper studies a formulation in which multiple behavioral itineraries may be supplied, with sensors selected to distinguish between behaviors. This allows one to pose fine-grained questions, e.g., to position the agent’s activity on a spectrum. In addition, with multiple itineraries, one can also ask about choices of sensors where some behavior is always plausibly concealed by (or mistaken for) another. Using sensor ambiguity to limit the acquisition of knowledge is a strong privacy guarantee, a form of guarantee which some earlier work examined under formulations distinct from our inter-itinerary conflation approach. By concretely formulating privacy requirements for sensor selection, this paper connects both lines of work in a novel fashion: privacy—where there is a bound from above, and behavior verification—where sensor choices are bounded from below. We examine the worst-case computational complexity that results from both types of bounds, proving that upper bounds are more challenging under standard computational complexity assumptions. The problem is intractable in general, but we introduce an approach to solving this problem that can exploit interrelationships between constraints, and identify opportunities for optimizations. Case studies are presented to demonstrate the usefulness and scalability of our proposed solution, and to assess the impact of the optimizations.
|
|
09:00-09:06, Paper WeAT9.6 | Add to My Program |
An Interactive System for Multiple-Task Linear Temporal Logic Path Planning |
|
Chen, Yizhou | Chinese University of Hong Kong |
Wang, Xinyi | The Chinese University of Hong Kong |
Guo, Zixuan | The Chinese University of Hong Kong |
Wang, Ruoyu | The Chinese University of Hong Kong |
Zhou, Xunkuai | Tongji University |
Yang, Guidong | The Chinese University of Hong Kong |
Lai, Shupeng | National University of Singapore |
Chen, Ben M. | Chinese University of Hong Kong |
Keywords: Formal Methods in Robotics and Automation, Software-Hardware Integration for Robot Systems, Task and Motion Planning
Abstract: Beyond programming robots to accomplish a single high-level task at a time, people also hope robots follow instructions and complete a series of tasks while meeting their requirements. This paper presents an interactive software system that consists of a multiple-task linear temporal logic (LTL) path planner and a human-machine interface (HMI). The HMI transforms human oral instructions into task commands that can be understood by the machine. The planner grows a rapidly exploring random tree to search for solutions for multiple tasks. When switching tasks, the search tree is re-initialized and reconnected to reuse the information gathered during exploration of the workspace. The feasibility of the improved planner is theoretically guaranteed, and profiling in simulation shows an acceleration in planning. An experiment with a quadcopter is conducted to show that the combination of the multiple-task LTL planner and the HMI results in a synergistic effect in real-world applications.
|
|
09:06-09:12, Paper WeAT9.7 | Add to My Program |
Temporal Logic-Based Intent Monitoring for Mobile Robots |
|
Yoon, Hansol | University of Colorado Boulder |
Sankaranarayanan, Sriram | University of Colorado, Boulder |
Keywords: Formal Methods in Robotics and Automation, Intention Recognition, Autonomous Agents
Abstract: We propose a framework that uses temporal logic specifications to predict and monitor the intent of a robotic agent through passive observations of its actions over time. Our approach uses a set of possible hypothesized intents specified as Büchi automata, obtained from translating temporal logic formulae. Based on observing the actions of the robot, we update the probabilities of each hypothesis using Bayes rule. Observations of robot actions provide strong evidence for its "immediate" short-term goals, whereas temporal logic specifications describe behaviors over a "never-ending" infinite time horizon. To bridge this gap, we use a two-level hierarchical monitoring approach. At the lower level, we track the immediate short-term goals of the robot, which are modeled as atomic propositions in the temporal logic formalism. We apply our approach to predicting the intent of human workers, and thus their movements, in an indoor space based on the publicly available THOR dataset. We show that our approach correctly labels each agent with the appropriate intent after relatively few observations while predicting their future actions accurately over longer time horizons.
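The Bayes-rule update over intent hypotheses is the core of the monitoring loop and can be sketched in a few lines; the likelihoods of the observed action under each hypothesis are assumed given (e.g., supplied by the lower-level, automaton-based tracking):

```python
def bayes_intent_update(prior, likelihoods):
    """One Bayes-rule step: reweight each intent hypothesis by the likelihood
    of the newly observed action under that hypothesis, then renormalize."""
    post = [p * l for p, l in zip(prior, likelihoods)]
    z = sum(post)
    return [p / z for p in post]
```

Repeating this after every observed action concentrates probability mass on the hypothesis whose automaton best explains the action sequence.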
|
|
09:12-09:18, Paper WeAT9.8 | Add to My Program |
Evaluation Metrics of Object Detection for Quantitative System-Level Analysis of Safety-Critical Autonomous Systems |
|
Badithela, Apurva | Caltech |
Wongpiromsarn, Tichakorn | Iowa State University |
Murray, Richard | California Institute of Technology |
Keywords: Formal Methods in Robotics and Automation, Object Detection, Segmentation and Categorization, Probability and Statistical Methods
Abstract: This paper proposes two metrics for evaluating learned object detection models: the proposition-labeled and distance-parametrized confusion matrices. These metrics are leveraged to quantitatively analyze the system with respect to its system-level formal specifications via probabilistic model checking. In particular, we derive transition probabilities from these confusion matrices to compute the probability that the closed-loop system satisfies its system-level specifications expressed in temporal logic. Instead of using object class labels, the proposition-labeled confusion matrix uses atomic propositions relevant to the high-level planning strategy. Furthermore, unlike the traditional confusion matrix, the proposed distance-parametrized confusion matrix accounts for variations in detection performance with respect to the distance between the ego and the object. Empirically, these evaluation metrics, chosen by considering system-level specifications and planning module design, result in less conservative system-level evaluations than those from traditional confusion matrices. We demonstrate this framework on a car-pedestrian example by computing the satisfaction probabilities for safety requirements formalized in Linear Temporal Logic.
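Deriving transition probabilities from a distance-parametrized confusion matrix amounts to row-normalizing per-bin counts into conditional probabilities P(predicted | true); a minimal sketch with a hypothetical data layout:

```python
import numpy as np

def detection_probabilities(conf_by_bin):
    """Row-normalize a distance-parametrized confusion matrix, given as
    {distance_bin: counts[true][predicted]}, into per-bin conditional
    probabilities P(predicted | true) for use in probabilistic model checking."""
    probs = {}
    for dist_bin, counts in conf_by_bin.items():
        C = np.asarray(counts, float)
        probs[dist_bin] = C / C.sum(axis=1, keepdims=True)
    return probs
```

In the proposition-labeled variant, the rows and columns would index planner-relevant atomic propositions (e.g., "pedestrian in crosswalk") rather than raw object classes.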
|
|
09:18-09:24, Paper WeAT9.9 | Add to My Program |
Energy-Aware Planning of Heterogeneous Multi-Agent Systems for Serving Cooperative Tasks with Temporal Logic Specifications |
|
Buyukkocak, Ali Tevfik | University of Minnesota |
Aksaray, Derya | Northeastern University |
Yazicioglu, Yasin | Northeastern University |
Keywords: Formal Methods in Robotics and Automation, Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents
Abstract: We address a coordination problem for a team of heterogeneous and energy-limited agents to achieve cooperative tasks given as team-level spatio-temporal specifications. We assume that agents have stochastic energy dynamics and do not have identical capabilities. We define the team-level specification using Signal Temporal Logic (STL) with integral predicates, which can express tasks that can be completed collectively in an asynchronous way. We first abstract the environment as a graph using sampling-based methods. This abstraction includes the possible paths of different types of agents and ensures the availability of recharging within a certain distance. Then, we formulate a mixed-integer program over this abstraction to find the high-level paths of the agents. Finally, we steer the agents in the environment according to the nominal plan under stochastic energy consumption models and a recharging policy. Such stochastic energy dynamics cause deviations from the nominal plan and delays in completing the tasks. Accordingly, we define and evaluate the expected delay (temporal relaxation) in achieving the STL specification under the proposed solution.
|
|
09:24-09:30, Paper WeAT9.10 | Add to My Program |
Efficient Symbolic Approaches for Quantitative Reactive Synthesis with Finite Tasks |
|
Muvvala, Karan | University of Colorado Boulder |
Lahijanian, Morteza | University of Colorado Boulder |
Keywords: Formal Methods in Robotics and Automation, Hybrid Logical/Dynamical Planning and Verification, Task Planning
Abstract: This work introduces efficient symbolic algorithms for quantitative reactive synthesis. We consider resource-constrained robotic manipulators that need to interact with a human to achieve a complex task expressed in linear temporal logic. Our framework generates reactive strategies that not only guarantee task completion but also seek cooperation with the human when possible. We model the interaction as a two-player game and consider regret-minimizing strategies to encourage cooperation. We use symbolic representation of the game to enable scalability. For synthesis, we first introduce value iteration algorithms for such games with min-max objectives. Then, we extend our method to the regret-minimizing objectives. Our benchmarks reveal that our symbolic framework not only significantly improves computation time (up to an order of magnitude) but also scales to much larger instances of manipulation problems, with up to 2x the number of objects and locations than the state of the art.
|
|
09:30-09:36, Paper WeAT9.11 | Add to My Program |
Minimal Path Violation Problem with Application to Fault Tolerant Motion Planning of Manipulators |
|
Upadhyay, Aakriti | University at Albany, SUNY |
Ghosh, Mukulika | Missouri State University |
Ekenna, Chinwe | University at Albany |
Keywords: Formal Methods in Robotics and Automation, Motion and Path Planning, Failure Detection and Recovery
Abstract: Failure of any component in a robotic system during operation is a critical concern, and it is essential to address such incidents promptly. This work investigates a novel technique to recover from failures or changes in the configuration space while avoiding expensive re-computation or re-planning. To achieve this, we present a novel concept, Minimal Path Violation (MPV), that aids in finding the best feasible path by performing the fewest re-configurations from the possibly infeasible straight-line path connecting start and goal configurations. This work builds on our previous work that approximates the topological and geometric properties of the configuration space to provide coarsely-diverse pathways. The algorithm sorts the diverse paths using the path cost, nodes' visibility, and edges' expansiveness. The ranking measure and the MPV guide the planning algorithm to find the best alternate route in the changed configuration space without re-planning from scratch. We perform experiments with articulated 3 to 28 DOF (degrees of freedom) robots ranging from serial linkage robots, Kuka YouBots, and PR2 robots. Our results show that our method outperforms existing optimal planners in computation time, total nodes, and path quality while preserving route feasibility in changed configuration space.
|
|
09:36-09:42, Paper WeAT9.12 | Add to My Program |
Reinforcement Learning under Probabilistic Spatio-Temporal Constraints with Time Windows |
|
Lin, Xiaoshan | University of Minnesota, Twin Cities |
Koochakzadeh, Abbasali | Purdue University |
Yazicioglu, Yasin | Northeastern University |
Aksaray, Derya | Northeastern University |
Keywords: Formal Methods in Robotics and Automation, Reinforcement Learning
Abstract: We propose an automata-theoretic approach for reinforcement learning (RL) under complex spatio-temporal constraints with time windows. The problem is formulated using a Markov decision process under a bounded temporal logic constraint. Different from existing RL methods that can eventually learn optimal policies satisfying such constraints, our proposed approach enforces a desired probability of constraint satisfaction throughout learning. This is achieved by translating the bounded temporal logic constraint into a total automaton and avoiding "unsafe" actions based on the available prior information regarding the transition probabilities, i.e., a pair of upper and lower bounds for each transition probability. We provide theoretical guarantees on the resulting probability of constraint satisfaction. We also provide numerical results in a scenario where a robot explores the environment to discover high-reward regions while fulfilling some periodic pick-up and delivery tasks that are encoded as temporal logic constraints.
|
|
09:42-09:48, Paper WeAT9.13 | Add to My Program |
Lie Group Formulation and Sensitivity Analysis for Shape Sensing of Variable Curvature Continuum Robots with General String Encoder Routing (I) |
|
Orekhov, Andrew | Carnegie Mellon University |
Ahronovich, Elan | Vanderbilt ARMA |
Simaan, Nabil | Vanderbilt University |
Keywords: Kinematics, Flexible Robots, Tendon/Wire Mechanism, Lie group kinematics
Abstract: This article considers a combination of actuation tendons and measurement strings to achieve accurate shape sensing and direct kinematics of continuum robots. Assuming general string routing, a methodical Lie group formulation for the shape sensing of these robots is presented. The shape kinematics is expressed using arc-length-dependent curvature distributions parameterized by modal functions, and the Magnus expansion for Lie group integration is used to express the shape as a product of exponentials. The tendon and string length kinematic constraints are solved for the modal coefficients and the configuration space and body Jacobian are derived. The noise amplification index for the shape reconstruction problem is defined and used for optimizing the string/tendon routing paths, and a planar simulation study shows the minimal number of strings/tendons needed for accurate shape reconstruction. A torsionally stiff continuum segment is used for experimental evaluation, demonstrating mean (maximal) end-effector absolute position error of less than 2% (5%) of total length. Finally, a simulation study of a torsionally compliant segment demonstrates the approach for general deflections and string routings. We believe that the methods of this article can benefit the design process, sensing, and control of continuum and soft robots.
|
|
WeAT10 Regular session, 250ABC |
Add to My Program |
Dexterous Manipulation |
|
|
Chair: Yamane, Katsu | Path Robotics Inc |
Co-Chair: Hermans, Tucker | University of Utah |
|
08:30-08:36, Paper WeAT10.1 | Add to My Program |
Planning Visual-Tactile Precision Grasps Via Complementary Use of Vision and Touch |
|
Matak, Martin | University of Utah |
Hermans, Tucker | University of Utah |
Keywords: Multifingered Hands, Grasping, Deep Learning in Grasping and Manipulation
Abstract: Reliably planning fingertip grasps for multi-fingered hands remains a key challenge for many tasks including tool use, insertion, and dexterous in-hand manipulation. This task becomes even more difficult when the robot lacks an accurate model of the object to be grasped. Tactile sensing offers a promising approach to account for uncertainties in object shape. However, current robotic hands tend to lack full tactile coverage. As such, a problem arises of how to plan and execute grasps for multi-fingered hands such that contact is made with the area covered by the tactile sensors. To address this issue, we propose an approach to grasp planning that explicitly reasons about where the fingertips should contact the estimated object surface while maximizing the probability of grasp success. Key to our method's success is the use of visual surface estimation for initial planning to encode the contact constraint. The robot then executes this plan using a tactile-feedback controller that enables the robot to adapt to online estimates of the object's surface to correct for errors in the initial plan. Importantly, the robot never explicitly integrates object pose or surface estimates between visual and tactile sensing; instead it uses the two modalities in complementary ways. Vision guides the robot's motion prior to contact; touch updates the plan when contact occurs differently than predicted from vision. We show that our method successfully synthesises and executes precision grasps for previously unseen objects using surface estimates from a single camera view. Further, our approach outperforms a state-of-the-art multi-fingered grasp planner, while also beating several baselines we propose.
|
|
08:36-08:42, Paper WeAT10.2 | Add to My Program |
A Unified Trajectory Generation Algorithm for Dynamic Dexterous Manipulation |
|
Zhou, Cheng | Tencent |
Gao, Wentao | University of Bristol |
Weifeng, Lu | The City University of Hong Kong |
Long, Yanbo | University of Bristol |
Yang, Sicheng | Tencent |
Zhao, Longfei | TENCENT |
Huang, Bidan | Tencent |
Zheng, Yu | Tencent |
Keywords: Dexterous Manipulation, Grasping, Manipulation Planning
Abstract: This paper proposes a novel, efficient multi-phase trajectory generation algorithm for dynamic dexterous manipulation tasks such as throwing, catching, dynamic regrasping, and dynamic handover. The manipulation tasks can be decomposed into multiple manipulation primitives, such as approaching, separating, rolling, sticking, grasping, and colliding. Each manipulation primitive can be formulated as a free-terminal optimal control problem (OCP) consisting of the optimal pose (position and attitude) trajectory of the object and the robot, the pose and force linkage constraints between the object and the robot, and the expected force maintenance on the contact point. A single-arm regrasping task and a dual-arm dynamic handover task are conducted to demonstrate the effectiveness of the presented algorithm.
|
|
08:42-08:48, Paper WeAT10.3 | Add to My Program |
Hybrid Learning and Model-Based Planning and Control of In-Hand Manipulation |
|
Soltani Zarrin, Rana | Honda Research Institute - USA |
Jitosho, Rianna | Stanford University |
Yamane, Katsu | Path Robotics Inc |
Keywords: In-Hand Manipulation, Multifingered Hands, Motion Control
Abstract: This paper presents a hierarchical framework for planning and control of in-hand manipulation of a rigid object involving grasp changes using fully-actuated multi-fingered robotic hands. While the framework can be applied to general dexterous manipulation, we focus on a more complex definition of in-hand manipulation, where at the goal pose the hand has to reach a grasp suitable for using the object as a tool. The high-level planner determines the object trajectory as well as the grasp changes, i.e. adding, removing, or sliding fingers, to be executed by the low-level controller. While the grasp sequence is planned online by a learning-based policy to adapt to variations, the trajectory planner and the low-level controller for object tracking and contact force control are exclusively model-based to robustly realize the plan. By infusing the knowledge about the physics of the problem and the low-level controller into the grasp planner, it learns to successfully generate grasps similar to those generated by model-based optimization approaches, obviating the high computation cost of online running of such methods to account for variations. By performing experiments in physics simulation for realistic tool use scenarios, we show the success of our method on different tool-use tasks and dexterous hand models. Additionally, we show that this hybrid method offers more robustness to trajectory and task variations compared to a model-based method.
|
|
08:48-08:54, Paper WeAT10.4 | Add to My Program |
Vision-Based In-Hand Manipulation of Variously Shaped Objects Via Contact Point Prediction |
|
Isobe, Yuzuka | Chuo University |
Kang, Sunhwi | Panasonic Connect Co., Ltd |
Shimamoto, Takeshi | Panasonic Connect Co., Ltd |
Matsuyama, Yoshinari | Panasonic Connect Co., Ltd |
Pathak, Sarthak | Chuo University |
Umeda, Kazunori | Chuo University |
Keywords: In-Hand Manipulation, Visual Servoing, Perception for Grasping and Manipulation
Abstract: In-hand manipulation (IHM) is an important ability for robotic hands. This ability refers to changing the position and orientation of a grasped object without dropping it from the hand workspace. One major challenge of IHM is to achieve a large range of manipulation (especially rotation), regardless of the shape, size, and the orientation during manipulation of the grasped object. There are two main challenges - the manipulation range (due to the range of motion of the hand) and keeping the object grasped under all shapes and orientations. Specifically, even when the contact points between the hand and the object switch and the positions of these points change due to its shape and changing orientation, constant grasp of the object is required. This paper presents an IHM method for a robotic hand with belts, based on the prediction of the contact-point changes via image information. The focus is on a robotic hand that has a two-fingered parallel gripper with conveyor belts which can continuously manipulate an object through a large range. A stereo camera is attached to the hand. First, the contour of the grasped object is acquired from the camera. From the contour, the switching of the contact points between the surfaces of the belts and the object is predicted. Then, the positions of the contact points in the next frame are estimated by rotating the contour. The velocities of the belts are calculated based on the prediction of the switching. The fingers are controlled to follow the estimated positions of the contact points, via a feed-forward control. The effectiveness of the proposed method is verified through in-hand manipulation experiments for 22 objects of various shapes and sizes.
|
|
08:54-09:00, Paper WeAT10.5 | Add to My Program |
Object Manipulation through Contact Configuration Regulation: Multiple and Intermittent Contacts |
|
Taylor, Orion | MIT |
Doshi, Neel | MIT |
Rodriguez, Alberto | Massachusetts Institute of Technology |
Keywords: Dexterous Manipulation, Contact Modeling, Compliance and Impedance Control
Abstract: In this work, we build on our method for manipulating unknown objects via contact configuration regulation: the estimation and control of the location, geometry, and mode of all contacts between the robot, object, and environment. We further develop our estimator and controller to enable manipulation through more complex contact interactions, including intermittent contact between the robot/object, and multiple contacts between the object/environment. In addition, we support a larger set of contact geometries at each interface. This is accomplished through a factor graph based estimation framework that reasons about the complementary kinematic and wrench constraints of contact to predict the current contact configuration. We are aided by the incorporation of a limited amount of visual feedback; which when combined with the available F/T sensing and robot proprioception, allows us to differentiate contact modes that were previously indistinguishable. We implement this revamped framework on our manipulation platform, and demonstrate that it allows the robot to perform a wider set of manipulation tasks. This includes, using a wall as a support to re-orient an object, or regulating the contact geometry between the object and the ground. Finally, we conduct ablation studies to understand the contributions from visual and tactile feedback in our manipulation framework. Our code can be found at: https://github.com/mcubelab/pbal.
|
|
09:00-09:06, Paper WeAT10.6 | Add to My Program |
Non-Parametric Self-Identification and Model Predictive Control of Dexterous In-Hand Manipulation |
|
Chanrungmaneekul, Podshara | Rice University |
Ren, Kejia | Rice University |
Grace, Joshua | Yale University |
Dollar, Aaron | Yale University |
Hang, Kaiyu | Rice University |
Keywords: In-Hand Manipulation, Dexterous Manipulation
Abstract: Building hand-object models for dexterous in-hand manipulation remains a crucial and open problem. Major challenges include the difficulty of obtaining the geometric and dynamical models of the hand, object, and time-varying contacts, as well as the inevitable physical and perception uncertainties. Instead of building accurate models to map between the actuation inputs and the object motions, this work proposes to enable the hand-object systems to continuously approximate their local models via a process called self-identification. With a very small number of data points, as opposed to most data-driven methods, our system self-identifies the underlying manipulation models online through exploratory actions and non-parametric learning. By integrating the self-identified hand-object model into a model predictive control framework, the proposed system closes the control loop to provide high-accuracy in-hand manipulation. Furthermore, the proposed self-identification is able to adaptively trigger online updates through additional exploratory actions, as soon as the self-identified local models render large discrepancies against the observed manipulation outcomes. We implemented the proposed approach on a sensorless underactuated Yale Model O hand with a single external camera to observe the object's motion. With extensive experiments, we show that self-identification can enable accurate and robust dexterous manipulation without requiring an accurate system model or a large amount of data for offline training.
|
|
09:06-09:12, Paper WeAT10.7 | Add to My Program |
In-Hand Cube Reconfiguration: Simplified |
|
Patidar, Sumit | Technical University of Berlin |
Sieler, Adrian | Technische Universitaet Berlin |
Brock, Oliver | Technische Universität Berlin |
Keywords: In-Hand Manipulation
Abstract: We present a simple approach to in-hand cube reconfiguration. By simplifying planning, control, and perception as much as possible, while maintaining robust and general performance, we gain insights into the inherent complexity of in-hand cube reconfiguration. We also demonstrate the effectiveness of combining GOFAI-based planning with the exploitation of environmental constraints and inherently compliant end-effectors in the context of dexterous manipulation. The proposed system outperforms a substantially more complex system for cube reconfiguration based on deep learning and accurate physical simulation, contributing arguments to the discussion about what the most promising approach to general manipulation might be. Project website: https://rbo.gitlab-pages.tu-berlin.de/robotics/simpleIHM/
|
|
09:12-09:18, Paper WeAT10.8 | Add to My Program |
Dexterous Soft Hands Linearize Feedback-Control for In-Hand Manipulation |
|
Sieler, Adrian | Technische Universitaet Berlin |
Brock, Oliver | Technische Universität Berlin |
Keywords: In-Hand Manipulation, Compliance and Impedance Control
Abstract: This paper presents a feedback-control framework for in-hand manipulation (IHM) with dexterous soft hands that enables the acquisition of manipulation skills in the real world within minutes. We choose the deformation state of the soft hand as the control variable. To control for a desired deformation state, we use coarsely approximated Jacobians of the actuation-deformation dynamics. These Jacobians are obtained via explorative actions. This is enabled by the self-stabilizing properties of compliant hands, which allow us to use linear feedback control in the presence of complex contact dynamics. To evaluate the effectiveness of our approach, we show the generalization capabilities of a learned manipulation skill to variations in object size by 100%, to 360-degree changes in palm inclination, and to disabling up to 50% of the involved actuators. In addition, complex manipulations can be obtained by sequencing such feedback-skills.
|
|
09:18-09:24, Paper WeAT10.9 | Add to My Program |
In-Hand Manipulation of Unknown Objects with Tactile Sensing for Insertion |
|
Lepert, Marion | Stanford University |
Pan, Chaoyi | Tsinghua University |
Yuan, Shenli | SRI International |
Antonova, Rika | Stanford University |
Bohg, Jeannette | Stanford University |
Keywords: In-Hand Manipulation, Reactive and Sensor-Based Planning, Perception for Grasping and Manipulation
Abstract: In this paper, we present a method to manipulate unknown objects in-hand using tactile sensing without relying on a known object model. In many cases, vision-only approaches may not be feasible; for example, due to occlusion in cluttered spaces. We address this limitation by introducing a method to reorient unknown objects using tactile sensing. It incrementally builds a probabilistic estimate of the object shape and pose during task-driven manipulation. Our approach uses Bayesian optimization to balance exploration of the global object shape with efficient task completion. To demonstrate the effectiveness of our method, we apply it to a simulated Tactile-Enabled Roller Grasper, a gripper that rolls objects in hand while collecting tactile data. We evaluate our method on an insertion task with randomly generated objects and find that it reliably reorients objects while significantly reducing the exploration time.
|
|
09:24-09:30, Paper WeAT10.10 | Add to My Program |
Bi-Manual Robot Shoe Lacing |
|
Luo, Haining | Imperial College London |
Demiris, Yiannis | Imperial College London |
Keywords: Bimanual Manipulation, Dexterous Manipulation, Dual Arm Manipulation
Abstract: Shoe lacing (SL) is a challenging sensorimotor task in daily life and a complex engineering problem in the shoe-making industry. In this paper, we propose a system for autonomous SL. It contains a mathematical definition of the SL task and searches for the best lacing pattern corresponding to the shoe configuration and the user preference. We propose a set of action primitives and generate plans of action sequences according to the designed pattern. Our system plans the trajectories based on the perceived position of the eyelets and aglets with an active perception strategy, and deploys the trajectories on a bi-manual robot. Experiments demonstrate that the proposed system can successfully lace 3 different shoes in different configurations, with a completion rate of 92.0%, 91.6% and 77.5% for 6, 8 and 10-eyelet patterns respectively. To the best of our knowledge, this is the first demonstration of autonomous SL using a bi-manual robot.
|
|
09:30-09:36, Paper WeAT10.11 | Add to My Program |
Hand Design Approach for Planar Fully Actuated Manipulators |
|
Nave, Keegan | Oregon State University |
DuFrene, Kyle | Oregon State University |
Swenson, Nigel | Oregon State University |
Balasubramanian, Ravi | Oregon State University |
Grimm, Cindy | Oregon State University |
Keywords: In-Hand Manipulation, Performance Evaluation and Benchmarking, Dexterous Manipulation
Abstract: Robotic in-hand manipulation increases the capability of robotic hands to interact with the world. The amount of manipulation that a robot is capable of is highly dependent on the design of the robot hand, and previous works have shown success in designing hands to improve performance for different types of grasping and manipulation. In this paper we present a method for designing a fully-actuated planar manipulator that optimizes for specific in-hand motions. We demonstrate that, with the Asterisk Benchmark and a lightweight IK controller, we can translate our results from simulation to the real world with minimal effort and high fidelity. Using the simulated data (over 4,000 simulated hand designs) we begin to analyze which features contribute to improved planar manipulation.
|
|
09:36-09:42, Paper WeAT10.12 | Add to My Program |
Dynamic Finger Gaits Via Pivoting and Adapting Contact Forces |
|
Xue, Yuechuan | Amazon.com |
Tang, Ling | Iowa State University |
Jia, Yan-Bin | Iowa State University |
Keywords: Dexterous Manipulation, In-Hand Manipulation, Multifingered Hands
Abstract: For over three decades, finger gaiting has remained largely a subject for theoretical inquiries. Successful execution of a sequence of finger gaits does not simply reduce to planning collision-free paths for the involved fingers. A major issue is how to move the gaiting finger without losing the finger contacts with the object, which will most likely undergo a motion as the contact forces need to be adapted during the gait. This paper focuses on a single finger gait executed on a tool by an anthropomorphic hand driven by an arm. To improve stability, the tool's tip is leveraged as a pivot on the supporting plane. The gait consists of three stages: removal, during which the contact force on the gaiting finger gradually decreases to zero; relocation, during which the finger follows a pre-planned path (relative to the moving object) to establish a new contact; and addition, during which the contact force on the relocated finger increases to some desired level. Hybrid position/impedance control employs reference finger forces that satisfy the friction cone constraints and are dynamically consistent with the object's motion, which in turn provides reference poses for the fingertips to maintain their contacts during the gait. Finger gaits have been demonstrated on a kitchen knife and a screwdriver with an Adept SCARA robot and a Shadow Dexterous Hand.
|
|
09:42-09:48, Paper WeAT10.13 | Add to My Program |
Rotating Objects Via In-Hand Pivoting Using Vision, Force and Touch |
|
Xu, Shiyu | Monash University |
Liu, Tianyuan | University of Melbourne |
Wong, Michael | Monash University |
Kulic, Dana | Monash University |
Cosgun, Akansel | Monash University |
Keywords: In-Hand Manipulation, Sensor-based Control, Force and Tactile Sensing
Abstract: We propose a robotic manipulation method that can pivot objects on a surface using vision, wrist force and tactile sensing. We aim to control the rotation of an object around the grip point of a parallel gripper by allowing rotational slip, while maintaining a desired wrist force profile. Our approach runs an end-effector position controller and a gripper width controller concurrently in a closed loop. The position controller maintains a desired force using vision and wrist force. The gripper controller uses tactile sensing to keep the grip firm enough to prevent translational slip, but loose enough to allow rotational slip. Our sensor-based control approach relies on matching a desired force profile derived from object dimensions and weight, as well as vision-based monitoring of the object pose. The gripper controller uses tactile sensors to detect and prevent translational slip by tightening the grip when needed. Experimental results where the robot was tasked with rotating cuboid objects 90 degrees show that the multi-modal pivoting approach was able to rotate the objects without causing lift or translational slip, and was more energy-efficient compared to using a single sensor modality or pick-and-place.
|
|
09:48-09:54, Paper WeAT10.14 | Add to My Program |
Functional Grasp Transfer across a Category of Objects from Only One Labeled Instance |
|
Wu, Rina | Dalian University of Technology |
Zhu, Tianqiang | Dalian University of Technology |
Peng, Wanli | Dalian University of Technology |
Hang, JingLue | Dalian University of Technology |
Sun, Yi | Dalian University of Technology |
Keywords: Dexterous Manipulation, Multifingered Hands, Grasping
Abstract: To assist or replace human beings in completing various tasks, research on the functional grasp synthesis of dexterous hands with high degrees of freedom (DoF) is necessary and challenging. A dexterous functional grasp requires not only that the grasp be stable but, more importantly, that it facilitate functional manipulation after grasping. Such work still relies on manual annotation when collecting data. To this end, we propose a category-level multi-fingered functional grasp transfer framework, in which we only need to label the hand-object contact relationship on the functional parts of one object, and then transfer the contact information through the dense correspondence of functional parts between objects, so as to achieve functional grasp synthesis for new objects based on the transferred hand-object contact information. We verify this method on three categories of representative objects through simulation experiments and achieve successful functional grasps by labeling only one instance in each category. The project website is: https://github.com/wurina-github/FGTrans.
|
|
WeAT11 Regular session, 251ABC |
Add to My Program |
Swarms |
|
|
Chair: Pinciroli, Carlo | Worcester Polytechnic Institute |
Co-Chair: Gross, Roderich | The University of Sheffield |
|
08:30-08:36, Paper WeAT11.1 | Add to My Program |
CoFlyers: A Universal Platform for Collective Flying of Swarm Drones |
|
Huang, Jialei | Sun Yat-Sen University |
Wang, Fakui | Sun Yat-Sen University |
Hu, Tianjiang | Sun Yat-Sen University |
Keywords: Swarm Robotics, Aerial Systems: Applications
Abstract: Swarm drone flying is a very attractive field of robotics research, motivated by natural bird flocking and other animal collective behaviors. In this paper, we propose and develop CoFlyers, an open-source universal platform for end-to-end, whole-chain development from flocking-inspired models to real-drone swarm flying. In particular, CoFlyers is more user-friendly than several existing platforms, which mix programming languages or require more effort on raw functional modules, as it uses only the unified programming language of MATLAB&Simulink. The prototype simulator of CoFlyers is implemented in MATLAB, allowing users to quickly develop and prototype swarm flying algorithms, and to conduct task-oriented parameter auto-tuning and batch processing within reproducible scenarios. Moreover, a real-world verification module for swarm drones is developed in Simulink, which directly calls the prototype simulator modules for code reuse. It connects to external platforms via a standardized user-datagram-protocol communication interface. As a case study, CoFlyers is applied to a multi-drone collective flying scenario in confined environments, using ROS&PX4&Gazebo for high-fidelity simulation and Optitrack&Tello-drones for experiments. Both simulation and experimental results demonstrate and validate the user-friendly practicability of CoFlyers.
|
|
08:36-08:42, Paper WeAT11.2 | Add to My Program |
Collective Decision-Making and Change Detection with Bayesian Robots in Dynamic Environments |
|
Pfister, Kai | University of Luebeck |
Hamann, Heiko | University of Konstanz |
Keywords: Swarm Robotics
Abstract: Solving complex problems collectively with simple entities is a challenging task for swarm robotics. For the task of collective decision-making, robots decide based on local observations on the microscopic level to achieve consensus on the macroscopic level. We study this problem for a common benchmark of classifying distributed features in a binary dynamic environment. Our special focus is on environmental features that are dynamic as they change during the experiment. We present a control algorithm that uses sophisticated statistical change detection in combination with Bayesian robots to classify dynamic environments. The main profit is to reduce false positives allowing for improved speed and accuracy in decision-making. Supported by results from various simulated experiments, we introduce three feedback loops to balance speed and accuracy. In our benchmarks, we show the superiority of our new approach over previous works on Bayesian robots. Our approach of using change detection shows a more reliable detection of environmental changes. This enables the swarm to successfully classify even difficult environments (i.e., hard to detect differences between the binary features), while achieving faster and more accurate results in simpler environments.
|
|
08:42-08:48, Paper WeAT11.3 | Add to My Program |
Autonomous Swarm Robot Coordination Via Mean-Field Control Embedding Multi-Agent Reinforcement Learning |
|
Tang, Huaze | Tsinghua University |
Zhang, Hengxi | Tsinghua University |
Shi, Zhenpeng | Tsinghua University |
Chen, Xinlei | Tsinghua University |
Ding, Wenbo | Tsinghua University |
Zhang, Xiao-Ping | Ryerson University |
Keywords: Swarm Robotics, Reinforcement Learning, Multi-Robot Systems
Abstract: Learning-based approaches to designing controllers that guide the collective behavior of swarm robots have gained significant attention in recent years. However, the scalability of swarm robots and their inherent stochasticity complicate the control problem due to increasing complexity, unpredictability, and non-linearity. Despite considerable progress in swarm robotics, addressing these challenges remains a significant issue. In this work, we model the stochastic dynamics of a swarm robot system and then propose a novel control framework based on a mean-field control (MFC) embedding multi-agent reinforcement learning (MARL) approach, named MF-MARL, to deal with these challenges. While MARL is able to deal with stochasticity statistically, we integrate MFC to allow MF-MARL to cope with large-scale robot swarms. Moreover, we use statistical moments of the robots' states and control actions to discretize continuous inputs, enabling MF-MARL to be applied in continuous scenarios. To demonstrate the effectiveness of MF-MARL, we evaluate the performance of the robots on a specific swarm simulation platform. The experimental results show that our algorithm outperforms traditional algorithms in both navigation and manipulation tasks. Finally, we demonstrate the adaptability of the proposed algorithm through a component-failure test.
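The mean-field idea of summarizing the swarm by statistical moments can be sketched as follows; the helper name and the choice of central moments are illustrative assumptions, not the paper's exact formulation.

```python
def mean_field_features(states, max_order=3):
    """Summarize a swarm's (scalar) state distribution by its mean and
    central moments up to `max_order`, giving a fixed-size policy input
    that is independent of the number of robots in the swarm."""
    n = len(states)
    mu = sum(states) / n
    feats = [mu]
    for k in range(2, max_order + 1):
        feats.append(sum((s - mu) ** k for s in states) / n)
    return feats
```

Because the feature vector's length depends only on `max_order`, the same policy network can be reused as the swarm grows, which is the scalability benefit the abstract attributes to the mean-field embedding.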
|
|
08:48-08:54, Paper WeAT11.4 | Add to My Program |
Multi-Instance Task in Swarm Robotics: Sorting Groups of Robots or Objects into Clusters with Minimalist Controllers |
|
Krischanski, Adilson | Santa Catarina State University (UDESC) |
Kaszubowski Lopes, Yuri | Santa Catarina State University (UDESC) |
Bittencourt Leal, André | Santa Catarina State University – UDESC |
Martins, Ricardo Ferreira | Santa Catarina State University |
Ubertino Rosso, Roberto | Universidade Do Estado De Santa Catarina |
Keywords: Swarm Robotics, Multi-Robot Systems, Behavior-Based Systems
Abstract: Relying only on behaviors that emerge from simple responsive controllers, swarms of robots have been shown to be capable of autonomously aggregating themselves or objects into clusters without any form of communication. We push these controllers to the limit, requiring robots to sort themselves or objects into different clusters. Based on a responsive controller that maps the current reading of a line-of-sight sensor to a pair of speeds for the robots' differential wheels, we demonstrate how multiple task instances can be accomplished by a robotic swarm. Using the dividing-rectangles approach and physics simulation, a training step optimizes the parameters of the controller guided by a fitness function. We conducted a series of systematic trials in physics-based simulation and evaluated performance in terms of dispersion and the ratio of clustered robots/objects. Across 20 trials in which 30 robots cluster themselves into 3 groups, an average of 99.83% of them were correctly clustered into their group after 300 s. Across 50 trials in which 15 robots cluster 30 objects into 3 groups, an average of 61.20%, 82.87%, and 97.73% of objects were correctly clustered into their group after 600 s, 900 s, and 1800 s, respectively. The object clustering behavior scales well while the aggregation does not, the latter due to the requirement of control tuning based on the number of robots.
|
|
08:54-09:00, Paper WeAT11.5 | Add to My Program |
Bio-Inspired 3D Flocking Algorithm with Minimal Information Transfer for Drone Swarms |
|
Verdoucq, Matthieu | Ecole Nationale De l'Aviation Civile |
Sire, Clément | Laboratoire De Physique Théorique, CNRS & Université De Toulouse |
Escobedo, Ramón | Université Paul Sabatier |
Theraulaz, Guy | Universite Paul Sabatier, Toulouse, France |
Hattenberger, Gautier | ENAC, French Civil Aviation University |
Keywords: Swarm Robotics, Biologically-Inspired Robots, Agent-Based Systems
Abstract: This article introduces a bio-inspired 3D flocking algorithm for a drone swarm, built upon a previously established 2D model that has proven effective in promoting stability, alignment, and distance regulation between agents within large groups. The study highlights how the incorporation of a vertical interaction between agents, together with each agent acquiring a minimal amount of information about its most influential neighbor, impacts the collective behavior of the swarm. Additionally, we present a comprehensive investigation of the impact of the intensity of alignment and attraction interactions on the collective motion patterns that emerge at the group level. These results, mostly obtained in a validated simulator, have significant implications for designing efficient UAV swarm systems and for using collective patterns, or phases, in operational contexts such as corridor tracking, surveillance, and exploration. Further research will explore the effectiveness and efficiency of this UAV swarm flocking algorithm, as well as its ability to ensure safe transitions between collective phases in different operational contexts.
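A minimal sketch of the "most influential neighbor" idea: each drone reacts to a single neighbor only, with alignment and attraction terms. Selecting the nearest neighbor as "most influential", and the gain values, are simplifying assumptions for illustration, not the paper's calibrated interaction model.

```python
import math

def flocking_update(pos, vel, neighbors, k_align=0.5, k_attr=0.2):
    """pos, vel: 3-vectors (lists); neighbors: list of (pos, vel) pairs.
    Returns the updated velocity after interacting with one neighbor."""
    def sub(a, b):
        return [a[i] - b[i] for i in range(3)]

    def norm(v):
        return math.sqrt(sum(c * c for c in v))

    # "Most influential" neighbor, approximated here as the nearest one.
    nb_pos, nb_vel = min(neighbors, key=lambda nb: norm(sub(nb[0], pos)))
    attraction = sub(nb_pos, pos)   # pull toward the neighbor's position
    alignment = sub(nb_vel, vel)    # match the neighbor's velocity
    return [vel[i] + k_align * alignment[i] + k_attr * attraction[i]
            for i in range(3)]
```

Restricting each agent to one neighbor keeps the per-drone information transfer minimal, which is the property the abstract emphasizes.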
|
|
09:00-09:06, Paper WeAT11.6 | Add to My Program |
A Generic Framework for Byzantine-Tolerant Consensus Achievement in Robot Swarms |
|
Zhao, Hanqing | McGill University |
Pacheco, Alexandre | Université Libre De Bruxelles |
Strobel, Volker | ULB |
Reina, Andreagiovanni | Université Libre De Bruxelles |
Liu, Xue | McGill University |
Dudek, Gregory | McGill University |
Dorigo, Marco | Université Libre De Bruxelles |
Keywords: Swarm Robotics, Failure Detection and Recovery, Robust/Adaptive Control
Abstract: Recent studies show that some security features that blockchains grant to decentralized networks on the internet can be ported to swarm robotics. Although the integration of blockchain technology and swarm robotics shows great promise, thus far, research has been limited to proof-of-concept scenarios where the blockchain-based mechanisms are tailored to a particular swarm task and operating environment. In this study, we propose a generic framework based on a blockchain smart contract that enables robot swarms to achieve secure consensus in an arbitrary observation space. This means that our framework can be customized to fit different swarm robotics missions, while providing methods to identify and neutralize Byzantine robots, that is, robots which exhibit detrimental behaviours stemming from faults or malicious tampering.
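A tiny stand-in for the identification mechanism the abstract describes: robots whose reported observation deviates too far from the swarm's aggregate are flagged as potentially Byzantine. The median aggregation, the deviation tolerance, and the function name are assumptions for illustration; the paper's actual mechanism lives in a blockchain smart contract.

```python
from statistics import median

def detect_byzantine(reports, tol=0.2):
    """reports: {robot_id: observed value}. Flags robots whose report
    deviates from the swarm median by more than `tol`; the median is
    robust to a minority of arbitrarily wrong (Byzantine) reports."""
    m = median(reports.values())
    return {rid for rid, v in reports.items() if abs(v - m) > tol}
```

Consensus would then be computed over the non-flagged reports only, neutralizing the influence of faulty or tampered robots.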
|
|
09:06-09:12, Paper WeAT11.7 | Add to My Program |
Sharing the Control of Robot Swarms among Multiple Human Operators: A User Study |
|
Miyauchi, Genki | The University of Sheffield |
Kaszubowski Lopes, Yuri | Santa Catarina State University (UDESC) |
Gross, Roderich | The University of Sheffield |
Keywords: Swarm Robotics, Multi-Robot Systems, Human Factors and Human-in-the-Loop
Abstract: Simultaneously controlling multiple robot swarms is challenging for a single human operator. When involving multiple operators, however, they can each focus on controlling a specific robot swarm, which helps distribute the cognitive workload. They could also exchange some robots with each other in response to the requirements of the tasks they discover. This paper investigates the ability of multiple operators to dynamically share the control of robot swarms and the effects of different communication types on performance and human factors. A total of 52 participants completed an experiment in which they were randomly paired to form a team. In a 2x2 mixed factorial study, participants were split into two groups by communication type (direct vs. indirect). Both groups experienced different robot-sharing conditions (robot-sharing vs. no-robot-sharing). Results show that although the ability to share robots did not necessarily increase task scores, it allowed the operators to switch between working independently and collaboratively, reduced the total energy consumed by the swarm, and was considered useful by the participants.
|
|
09:12-09:18, Paper WeAT11.8 | Add to My Program |
Decentralized Multi-Agent Reinforcement Learning with Global State Prediction |
|
Bloom, Joshua | Worcester Polytechnic Institute |
Paliwal, Pranjal | Worcester Polytechnic Institute |
Mukherjee, Apratim | Worcester Polytechnic Institute |
Pinciroli, Carlo | Worcester Polytechnic Institute |
Keywords: Swarm Robotics, Multi-Robot Systems, Reinforcement Learning
Abstract: Deep reinforcement learning (DRL) has seen remarkable success in the control of single robots. However, applying DRL to robot swarms presents significant challenges. A critical challenge is non-stationarity, which occurs when two or more robots update individual or shared policies concurrently, thereby engaging in an interdependent training process with no guarantees of convergence. Circumventing non-stationarity typically involves training the robots with global information about other agents' states and/or actions. In contrast, in this paper we explore how to remove the need for global information. We pose our problem as a Partially Observable Markov Decision Process, due to the absence of global knowledge on other agents. Using collective transport as a testbed scenario, we study two approaches to multi-agent training. In the first, the robots exchange no messages, and are trained to rely on implicit communication through push-and-pull on the object to transport. In the second approach, we introduce Global State Prediction (GSP), a network trained to form a belief over the swarm as a whole and predict its future states. We provide a comprehensive study over four well-known deep reinforcement learning algorithms in environments with obstacles, measuring performance as the successful transport of the object to a goal location within a desired time-frame. Through an ablation study, we show that including GSP boosts performance and increases robustness when compared with methods that use global knowledge.
|
|
09:18-09:24, Paper WeAT11.9 | Add to My Program |
Minimalistic Collective Perception with Imperfect Sensors |
|
Chin, Khai Yi | Worcester Polytechnic Institute |
Khaluf, Yara | Wageningen University |
Pinciroli, Carlo | Worcester Polytechnic Institute |
Keywords: Swarm Robotics, Multi-Robot Systems, Distributed Robot Systems
Abstract: Collective perception is a foundational problem in swarm robotics, in which the swarm must reach consensus on a coherent representation of the environment. An important variant of collective perception casts it as a best-of-n decision-making process, in which the swarm must identify the most likely representation out of a set of alternatives. Past work on this variant primarily focused on characterizing how different algorithms navigate the speed-vs-accuracy tradeoff in a scenario where the swarm must decide on the most frequent environmental feature. Crucially, past work on best-of-n decision-making assumes the robot sensors to be perfect (noise- and fault-free), limiting the real-world applicability of these algorithms. In this paper, we apply optimal estimation techniques and a decentralized Kalman filter to derive, from first principles, a probabilistic framework for minimalistic swarm robots equipped with flawed sensors. Then, we validate our approach in a scenario where the swarm collectively decides the frequency of a certain environmental feature. We study the speed and accuracy of the decision-making process with respect to several parameters of interest. Our approach can provide timely and accurate frequency estimates even in the presence of severe sensory noise.
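The core estimation step can be sketched as a scalar Kalman-style update for a static quantity (the feature's fill ratio): each noisy observation, or a neighbor's estimate, is fused with the current belief weighted by relative uncertainty. This is a hedged simplification of the paper's decentralized filter, not its derivation.

```python
def fuse(prior_mean, prior_var, obs, obs_var):
    """Scalar Kalman/Bayes update for a static quantity. The gain `k`
    weights the new observation by how uncertain the prior is relative
    to the observation; variance shrinks with every fusion."""
    k = prior_var / (prior_var + obs_var)
    mean = prior_mean + k * (obs - prior_mean)
    var = (1 - k) * prior_var
    return mean, var
```

Running this repeatedly over a robot's own noisy samples and over estimates exchanged with neighbors drives the whole swarm toward a shared, low-variance frequency estimate.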
|
|
09:24-09:30, Paper WeAT11.10 | Add to My Program |
Onboard Predictive Flocking of Quadcopter Swarm in the Presence of Obstacles and Faulty Robots |
|
Önür, Giray | Middle East Technical University |
Sahin, Mehmet | Middle East Technical University |
Keyvan, Erhan Ege | Middle East Technical University |
Turgut, Ali Emre | University |
Sahin, Erol | Middle East Technical University |
Keywords: Swarm Robotics, Multi-Robot Systems, Distributed Robot Systems
Abstract: Achieving fluent flocking, similar to that observed in birds and fish, on robotic swarms moving in a desired direction while avoiding obstacles using onboard sensing and computation remains a challenge. In a previous study (Önür et al., Proc. of ANTS 2022), we proposed a predictive flocking model as a computationally efficient method to generate smoother and more robust swarm motion. In this study, we extend this model to achieve safe flocking in cluttered environments in the presence of faulty robots that become immobilized during flocking. Systematic evaluation of the model in simulation with different swarm sizes and faulty-robot ratios has shown that safe flocking can be achieved even when 40% of the robots malfunction during flocking. Finally, we validate the model on a swarm of five micro quadcopters using only onboard range-and-bearing sensing and computation, in a distributed manner and without any communication.
|
|
09:30-09:36, Paper WeAT11.11 | Add to My Program |
OA-Bug: An Olfactory-Auditory Augmented Bug Algorithm for Swarm Robots in a Denied Environment |
|
Tan, Siqi | Beihang University |
Zhang, Xiaoya | Beihang University |
Li, Jingyao | Beijing University of Aeronautics and Astronautics |
Jing, Ruitao | Beihang University |
Zhao, Mufan | Beihang University |
Liu, Yang | Beihang University, Beijing, P.R.China |
Quan, Quan | Beihang University |
Keywords: Swarm Robotics, Search and Rescue Robots, Behavior-Based Systems
Abstract: Searching in a denied environment is challenging for swarm robots as no assistance from GNSS, mapping, data sharing, and central processing is allowed. However, using olfactory and auditory signals to cooperate like animals could be an important way to improve the collaboration of swarm robots. In this paper, an Olfactory-Auditory augmented Bug algorithm (OA-Bug) is proposed for a swarm of autonomous robots to explore a denied environment. A simulation environment is built to measure the performance of OA-Bug. The coverage of the search task can reach 96.93% using OA-Bug, which is significantly improved compared with a similar algorithm, SGBA. Furthermore, experiments are conducted on real swarm robots to prove the validity of OA-Bug. Results show that OA-Bug can improve the performance of swarm robots in a denied environment. Video: https://youtu.be/vj9cRiSm9eM.
|
|
09:36-09:42, Paper WeAT11.12 | Add to My Program |
Agent Prioritization and Virtual Drag Minimization in Dynamical System Modulation for Obstacle Avoidance of Decentralized Swarms |
|
Douce, Louis-Nicolas | EPFL |
Menichelli, Alessandro | Ecole Polytechnique Fédérale De Lausanne |
Huber, Lukas | EPFL |
Bolotnikova, Anastasia | EPFL |
Paez-Granados, Diego | ETH Zurich |
Ijspeert, Auke | EPFL |
Billard, Aude | EPFL |
Keywords: Swarm Robotics, Path Planning for Multiple Mobile Robots or Agents, Autonomous Vehicle Navigation
Abstract: Efficient and safe multi-agent swarm coordination in environments where humans operate, such as warehouses, assistive living spaces, or automated hospitals, is crucial for adopting automation. In this paper, we augment the obstacle avoidance algorithm based on dynamical system modulation for a swarm of heterogeneous holonomic mobile agents. A smooth prioritization is proposed to change the reactivity of the swarm towards specific agents. Further, a soft decoupling of the initial agent kinematics is used to design an independent rotation control that ensures an agent reaches its desired position and orientation simultaneously. This decoupling allows the introduction of a novel heuristic, the virtual drag, which minimizes the disturbance an agent causes when moving through its surroundings. Additionally, the safety module adapts the velocity commands from the dynamical system modulation to avoid colliding trajectories between agents. The evaluation was performed in simulated assisted-living and hospital environments. The prioritization successfully increased the minimum distance relative to a moving agent. The safety module is observed to create collision-free dynamics where alternative methods fail. Additionally, the repulsive nature of the safety module increases the convergence rate, making the proposed method better suited to dense real-world scenarios.
|
|
09:42-09:48, Paper WeAT11.13 | Add to My Program |
Spontaneous-Ordering Platoon Control for Multirobot Path Navigation Using Guiding Vector Fields (I) |
|
Hu, Binbin | Nanyang Technological University |
Zhang, Hai-Tao | Huazhong University of Science AndTechnology |
Yao, Weijia | University of Groningen |
Ding, Jianing | Huazhong University of Science and Technology |
Cao, Ming | University of Groningen |
Keywords: Swarms, Path Planning for Multiple Mobile Robots or Agents, Multi-Robot Systems
Abstract: In this article, we propose a distributed guiding-vector-field (DGVF) algorithm for a team of robots to form a spontaneous-ordering platoon moving along a predefined desired path in the n-dimensional Euclidean space. In particular, by adding a path parameter as an additional virtual coordinate to each robot, the DGVF algorithm can eliminate the singular points where the vector fields vanish and govern robots to approach a closed and even self-intersecting desired path. Then, the interactions among neighboring robots and a virtual target robot through their virtual coordinates enable the realization of the desired platoon; in particular, relative parametric displacements can be achieved with arbitrary ordering sequences. Rigorous analysis is provided to guarantee the global convergence of the spontaneous-ordering platoon to the common desired path from any initial positions. Two-dimensional experiments using three HUSTER-0.3 unmanned surface vessels (USVs) are conducted to validate the practical effectiveness of the proposed DGVF algorithm, and 3D numerical simulations are presented to demonstrate its effectiveness and robustness when tackling higher-dimensional multirobot path-navigation missions and when some robots break down.
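The virtual-coordinate idea behind singularity-free guiding vector fields can be illustrated for a circular path: the parameter w is treated as an extra state, and the field combines a tangent (propagation) term with error-correction terms. This sketch assumes a circle of radius R and a single gain k; it is a textbook-style illustration of the construction, not the paper's distributed multi-robot algorithm.

```python
import math

def gvf(x, y, w, R=1.0, k=1.0):
    """Guiding vector field in the augmented space (x, y, w) for the
    circle (R*cos w, R*sin w). Returns the field (chi_x, chi_y, chi_w):
    a tangent term that propagates motion along the path, plus
    error-correction terms driving the tracking errors e1, e2 to zero."""
    f1, f2 = R * math.cos(w), R * math.sin(w)       # path point at w
    df1, df2 = -R * math.sin(w), R * math.cos(w)    # path tangent at w
    e1, e2 = x - f1, y - f2                         # tracking errors
    chi_x = df1 - k * e1
    chi_y = df2 - k * e2
    chi_w = 1 + k * (e1 * df1 + e2 * df2)
    return chi_x, chi_y, chi_w
```

Because the extra coordinate w always advances (chi_w never forces the field to vanish on the path), the construction avoids the singular points of classical level-set vector fields, which is the property the abstract highlights.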
|
|
WeAT12 Regular session, 252AB |
Add to My Program |
Tactile Sensing I |
|
|
Chair: Zhang, Fangyi | Queensland University of Technology |
Co-Chair: Chalvatzaki, Georgia | Technische Universität Darmstadt |
|
08:30-08:36, Paper WeAT12.1 | Add to My Program |
DotView: A Low-Cost Compact Tactile Sensor for Pressure, Shear, and Torsion Estimation |
|
Zheng, Haoran | Zhejiang University |
Jin, Yongbin | Zhejiang University, Institute of Applied Mechanics |
Wang, Hongtao | Zhejiang University |
Zhao, Pei | Zhejiang University |
Keywords: Force and Tactile Sensing, Perception for Grasping and Manipulation, Contact Modeling
Abstract: Tactile sensation is one of the most important sensing modalities of humans, helping them complete complex decisions and actions either on its own or in cooperation with other senses. Similarly, tactile sensors are crucial for robots to interact with the environment and perform dexterous object manipulation. In this study, we present DotView, a compact, low-cost, and high-performance tactile sensor that utilizes a biomimetic micro-structured soft layer to generate rich, high-resolution tactile images captured by a commercial capacitive sensing array, exhibiting sensitive and robust responses under pressure, shear, and torsion applied by different indenters. Furthermore, we utilize a neural network to estimate multiaxial contact forces and torques, achieving good accuracy across different contact geometries. Finally, we demonstrate the capability of DotView in a dexterous manipulation scenario by performing the task of guiding a tool through a maze with only tactile feedback.
|
|
08:36-08:42, Paper WeAT12.2 | Add to My Program |
How the Fingerprint Effect Applies to Digitized Fingerprint-Like Structures |
|
Kovenburg, Robert | Texas Tech University |
George, Chase | Texas Tech University |
Gale, Richard | Texas Tech University |
Aksak, Burak | Texas Tech University |
Keywords: Force and Tactile Sensing, Soft Sensors and Actuators, Biomimetics
Abstract: The fingerprint effect describes the relationship between slip speed, fingerprint ridge spacing, and the frequency of vibrations created by the movement of a fingerprint across a surface. We have previously shown that the spacing between straight, parallel, evenly spaced ridges in fingerprint-like structures, and thus the vibrations produced by the fingerprint effect, are dependent on the orientation of the ridges with respect to the direction of movement. We also showed that, when ridge orientation is known, the fingerprint effect can be used to estimate slip speed in real-time. The fingerprint effect also applies to the interaction between a surface and other, non-ridge, microstructures. It is, therefore, theoretically possible to use the fingerprint effect generated by these structures to estimate slip speed. However, it is first necessary to understand the nature of the fingerprint effect generated by these non-ridge structures. In this paper, we show that digitized structures, evenly spaced in columns and rows, have a more complex relationship to the fingerprint effect than ridges do. At most orientations, these structures produce vibrations amplified around four frequencies, each determined by a set of virtual ridges defined by the digitized structures. A sensor with 100 µm tall, 150 µm wide micropillars in evenly spaced rows and columns, with a spacing of 300 µm center-to-center, is fabricated. This sensor is tested at angles between 0° and 90° in 15° increments. The results support our theoretical analysis.
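The basic fingerprint-effect relation, f = v / s_eff, can be written down directly. The 1/cos(θ) scaling of effective ridge spacing with orientation is an assumption used here for illustration; the paper's full model for digitized structures predicts four frequencies from sets of virtual ridges, which this sketch does not capture.

```python
import math

def fingerprint_frequency(slip_speed, spacing, theta_deg=0.0):
    """Fundamental vibration frequency f = v / s_eff for ridges of
    spacing `spacing` (m) slipping at `slip_speed` (m/s). The effective
    spacing is assumed to grow as 1/cos(theta) as the ridges rotate
    away from the slip direction (illustrative assumption)."""
    s_eff = spacing / math.cos(math.radians(theta_deg))
    return slip_speed / s_eff
```

With the fabricated sensor's 300 µm center-to-center spacing, a 30 mm/s slip at 0° would excite vibrations around 100 Hz under this model; inverting the relation is what allows slip-speed estimation from measured vibration frequency.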
|
|
08:42-08:48, Paper WeAT12.3 | Add to My Program |
Simulation, Learning, and Application of Vision-Based Tactile Sensing at Large Scale (I) |
|
Ho, Van | Japan Advanced Institute of Science and Technology |
Luu, Quan | Japan Advanced Institute of Science and Technology |
Nguyen, Nhan Huu | Japan Advanced Institute of Science and Technology |
Keywords: Force and Tactile Sensing, Soft Sensors and Actuators, Soft Robot Materials and Design, Modeling, Control, and Learning for Soft Robots
Abstract: Large-scale robotic skin with tactile sensing ability is emerging, with the potential for use in close-contact human-robot systems. Learning perception for such tactile devices demands a huge tactile dataset. We introduce a multi-physics simulation pipeline, SimTacLS, which considers not only the mechanical properties of external physical contact but also realistic rendering of tactile images in a simulation environment. The system utilizes the obtained simulation dataset, including virtual images and skin deformation, to train a tactile deep neural network to extract high-level tactile information. Moreover, we adopt a generative network to minimize sim2real inaccuracy, preserving the simulation-based tactile sensing performance. Finally, we showcase this sim2real sensing method for our large-scale tactile sensor (TacLink) by demonstrating its use in two trial cases, namely whole-arm nonprehensile manipulation and intuitive motion guidance, using a custom-built tactile robot arm integrated with TacLink. This work opens new possibilities for learning transferable tactile-driven robotic tasks from virtual worlds to actual scenarios.
|
|
08:48-08:54, Paper WeAT12.4 | Add to My Program |
A Two-Dimensional Reticular Core Optical Waveguide Sensor for Tactile and Positioning Sensing |
|
Liu, Zeyu | Institute of Automation, Chinese Acadamy of Sciences |
Li, Zhengwei | Institute of Automation, Chinese Academy of Sciences |
Cheng, Long | Chinese Academy of Sciences |
Keywords: Force and Tactile Sensing, Soft Sensors and Actuators, Soft Robot Materials and Design
Abstract: Tactile sensors based on optical waveguides are highly sensitive to pressure, possess good chemical inertness and electromagnetic resistance, and are unaffected by temperature changes in the surrounding environment. Researchers have developed various waveguide structures with multi-level cores to simultaneously measure tactile forces and positions. However, these designs result in thicker waveguides and reduced sensitivity in the lower levels. This study introduces a two-dimensional reticular core optical waveguide for tactile force and positioning sensing, where vertical waveguides intersect each other. The reticular core reduces waveguide thickness and simplifies fabrication processes. The simulation investigates the characteristics of light propagation and geometric parameters. Experimental results confirm the proposed reticular waveguide's force-sensing capability, with an average sensitivity of 0.36 dB/N. Compared to the split-level structure, the reticular waveguide demonstrates more consistent sensitivities along the two shear directions. Utilizing a deep neural network, the spatial resolution achieves approximately 0.72 mm along the X-axis and 1.14 mm along the Y-axis, outperforming the split-level structure.
|
|
08:54-09:00, Paper WeAT12.5 | Add to My Program |
Sliding Touch-Based Exploration for Modeling Unknown Object Shape with Multi-Fingered Hands |
|
Chen, Yiting | Wuhan University |
Tekden, Ahmet | Chalmers University of Technology |
Deisenroth, Marc Peter | University College London |
Bekiroglu, Yasemin | Chalmers University of Technology, University College London |
Keywords: Force and Tactile Sensing, Perception for Grasping and Manipulation, Perception-Action Coupling
Abstract: Efficient and accurate 3D object shape reconstruction contributes significantly to the success of a robot's physical interaction with its environment. Acquiring accurate shape information about unknown objects is challenging, especially in unstructured environments, e.g. the vision sensors may only be able to provide a partial view. To address this issue, tactile sensors could be employed to extract local surface information for more robust unknown object shape estimation. In this paper, we propose a novel approach for efficient unknown 3D object shape exploration and reconstruction using a multi-fingered hand equipped with tactile sensors and a depth camera only providing a partial view. We present a multi-finger sliding touch strategy for efficient shape exploration using a Bayesian Optimization approach and a single-leader-multi-follower strategy for multi-finger smooth local surface perception. We evaluate our proposed method by estimating the 3D shape of objects from the YCB and OCRTOC datasets based on simulation and real robot experiments. The proposed approach yields successful reconstruction results relying on only a few continuous sliding touches. Experimental results demonstrate that our method is able to model unknown objects in an efficient and accurate way.
|
|
09:00-09:06, Paper WeAT12.6 | Add to My Program |
Re-Evaluating Parallel Finger-Tip Tactile Sensing for Inferring Object Adjectives: An Empirical Study |
|
Zhang, Fangyi | Queensland University of Technology |
Corke, Peter | Queensland University of Technology |
Keywords: Force and Tactile Sensing, Object Detection, Segmentation and Categorization
Abstract: Finger-tip tactile sensors are increasingly used for robotic sensing to establish stable grasps and to infer object properties. Promising performance has been shown in a number of works for inferring adjectives that describe the object, but there remains a question about how each taxel contributes to the performance. This paper explores this question with empirical experiments, leading to insights for future finger-tip tactile sensor usage and design: one tactile sensor instead of a pair is sufficient for symmetric objects and interaction motions; dense taxels are beneficial for texture-related adjectives, but can be distracting for non-texture-related ones; and a frame rate much lower than that of the BioTac sensor can satisfy the demand of inferring object adjectives in the PHAC-2 dataset.
|
|
09:06-09:12, Paper WeAT12.7 | Add to My Program |
Content Estimation through Tactile Interactions with Deformable Containers |
|
Liu, Yu-En | National Yang Ming Chiao Tung University |
Chai, Chun-Yu | National Yang Ming Chiao Tung University |
Chen, Yi-Ting | National Yang Ming Chiao Tung University |
Tsao, Shiao-Li | National Yang Ming Chiao Tung University |
Keywords: Force and Tactile Sensing, Perception for Grasping and Manipulation, AI-Based Methods
Abstract: Pouring snacks and moving containers with beverages are challenging for a service robot. To obtain accurate content properties for planning robotic motion, tactile sensing can provide information about the pressure distribution of the contact surface, which is not obvious from visual observation. In this work, we focus on estimating the properties of various content materials in distinct deformable containers through tactile interactions. We propose a learning-based model that estimates content properties using the tactile data collected by slightly squeezing a container holding the content of interest. We analyzed an uncalibrated tactile sensor and collected a dataset consisting of 1125 tactile sequences, covering combinations of five types of deformable containers and eleven types of content materials at different content heights. Experiments were conducted on content estimation with known contents and containers, unknown contents, and unknown containers. For unknown contents, our model still achieves an 8.5% relative error in height and 79.7% accuracy on the state of matter. Furthermore, we showed that the tactile features of contents with similar properties lie close together in the latent space, demonstrating the effectiveness of our model.
|
|
09:12-09:18, Paper WeAT12.8 | Add to My Program |
Placing by Touching: An Empirical Study on the Importance of Tactile Sensing for Precise Object Placing |
|
Lach, Luca | Bielefeld University |
Funk, Niklas Wilhelm | TU Darmstadt |
Haschke, Robert | Bielefeld University |
Lemaignan, Séverin | PAL Robotics |
Ritter, Helge Joachim | Bielefeld University |
Peters, Jan | Technische Universität Darmstadt |
Chalvatzaki, Georgia | Technische Universität Darmstadt |
Keywords: Force and Tactile Sensing, Deep Learning in Grasping and Manipulation, Perception for Grasping and Manipulation
Abstract: This work deals with a practical everyday problem: stable object placement on flat surfaces starting from unknown initial poses. Common object-placing approaches require either complete scene specifications or extrinsic sensor measurements, e.g., cameras, that occasionally suffer from occlusions. We propose a novel approach for stable object placing that combines tactile feedback and proprioceptive sensing. We devise a neural architecture that estimates a rotation matrix, resulting in a corrective gripper movement that aligns the object with the placing surface for the subsequent object manipulation. We compare models with different sensing modalities, such as force-torque and an external motion capture system, in real-world object placing tasks with different objects. The experimental evaluation of our placing policies with a set of unseen everyday objects reveals significant generalization of our proposed pipeline, suggesting that tactile sensing plays a vital role in the intrinsic understanding of robotic dexterous object manipulation. Code, models, and supplementary videos are available on https://sites.google.com/view/placing-by-touching.
|
|
09:18-09:24, Paper WeAT12.9 | Add to My Program |
Incipient Slip Detection with a Biomimetic Skin Morphology |
|
Cordova Bulens, David | University College Dublin |
Lepora, Nathan | University of Bristol |
Redmond, Stephen | University College Dublin |
Ward-Cherrier, Benjamin | University of Bristol |
Keywords: Force and Tactile Sensing, Soft Sensors and Actuators
Abstract: Incipient slip is defined as the slippage of part, but not all, of the contact surface between a sensor and an object. Reliably detecting incipient slip in artificial tactile sensors would benefit autonomous robot handling capabilities by helping prevent object slippage during manipulation. Here, we present a biomimetic skin morphology based on the human fingerprint with application to marker-based tactile sensors such as the TacTip biomimetic optical tactile sensor. We modify the 3D printed outer membrane of the TacTip to mimic glabrous skin morphology with the inclusion of external ridges (fingerprint) and internal markers (intermediate ridges), allowing localised shear deformation of the sensor’s skin prior to the onset of gross slip. To validate the performance of this skin morphology, we train a random forest classifier (RFC) to identify incipient slip based on the extracted marker displacements from the sensor when it is compressed against an acrylic plate and moved laterally. The RFC model achieves 97.46% accuracy on incipient slip prediction, and is then validated on an unseen pouring task, in which gravity-induced incipient slip is detected on average within 418 ± 752 ms of its onset, and before gross slip in all trials. This accurate detection of incipient slip enables corrective actions prior to the onset of gross slip, a key capability in robotic manipulation and upper-limb prosthetics.
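The incipient-slip signal the abstract describes — part of the contact surface slipping while the rest stays pinned — can be illustrated with a toy statistic over per-marker skin displacements. All data and thresholds below are synthetic and illustrative; the paper instead feeds extracted marker displacements to a random forest classifier.

```python
import numpy as np

def slip_state(displacements, thresh=0.2):
    """Classify contact state from per-marker shear displacements (mm).

    Incipient slip: only part of the contact slips, so some markers move
    past `thresh` while the rest stay pinned; gross slip: nearly all move.
    The threshold values are illustrative, not taken from the paper.
    """
    moving = np.mean(np.abs(displacements) > thresh)
    if moving < 0.1:
        return "stuck"
    if moving < 0.9:
        return "incipient"
    return "gross"

# Synthetic marker fields: 30 markers under increasing shear load.
stuck = np.full(30, 0.05)                                          # all pinned
incipient = np.concatenate([np.full(10, 0.05), np.full(20, 0.5)])  # edge slips first
gross = np.full(30, 0.8)                                           # whole contact slips

print(slip_state(stuck), slip_state(incipient), slip_state(gross))
```

A classifier such as the paper's random forest would learn this kind of decision boundary from labelled marker-displacement sequences rather than from hand-set thresholds.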
|
|
09:24-09:30, Paper WeAT12.10 | Add to My Program |
GelSight Svelte: A Human Finger-Shaped Single-Camera Tactile Robot Finger with Large Sensing Coverage and Proprioceptive Sensing |
|
Zhao, Jialiang | Massachusetts Institute of Technology |
Adelson, Edward | MIT |
Keywords: Force and Tactile Sensing, Perception for Grasping and Manipulation, Multifingered Hands
Abstract: Camera-based tactile sensing is a low-cost, popular approach to obtain highly detailed contact geometry information. However, most existing camera-based tactile sensors are fingertip sensors, and longer fingers often require extraneous elements to obtain an extended sensing area similar to the full length of a human finger. Moreover, existing methods to estimate proprioceptive information such as total forces and torques applied on the finger from camera-based tactile sensors are not effective when the contact geometry is complex. We introduce GelSight Svelte, a curved, human finger-sized, single-camera tactile sensor that is capable of both tactile and proprioceptive sensing over a large area. GelSight Svelte uses curved mirrors to achieve the desired shape and sensing coverage. Proprioceptive information, such as the total bending and twisting torques applied on the finger, is reflected as deformations on the flexible backbone of GelSight Svelte, which are also captured by the camera. We train a convolutional neural network to estimate the bending and twisting torques from the captured images. We conduct gel deformation experiments at various locations of the finger to evaluate the tactile sensing capability and proprioceptive sensing accuracy. To demonstrate the capability and potential uses of GelSight Svelte, we conduct an object holding task with three different grasping modes that utilize different areas of the finger.
|
|
09:30-09:36, Paper WeAT12.11 | Add to My Program |
Estimating Properties of Solid Particles Inside Container Using Touch Sensing |
|
Guo, Xiaofeng | Carnegie Mellon University |
Huang, Hung-Jui | ISEE AI |
Yuan, Wenzhen | Carnegie Mellon University |
Keywords: Force and Tactile Sensing
Abstract: Solid particles, such as rice and coffee beans, are commonly stored in containers and are ubiquitous in our daily lives. Understanding these particles’ properties can inform subsequent decisions and manipulation tasks such as pouring. Humans typically interact with containers to get an understanding of the particles inside them, but this remains a challenge for robots. This work utilizes tactile sensing to estimate multiple properties of solid particles enclosed in a container, specifically, content mass, content volume, particle size, and particle shape. We design a sequence of robot actions to interact with the container. Based on physical understanding, we extract static force/torque values from the F/T sensor, and vibration-related and topple-related features from the newly designed high-speed GelSight tactile sensor, to estimate those four particle properties. We test our method on 37 diverse everyday particles, including powder, rice, beans, tablets, etc. Experiments show that our approach is able to estimate content mass with an error of 1.8 g, content volume with an error of 6.1 ml, particle size with an error of 1.1 mm, and achieves an accuracy of 75.6% for particle shape estimation. In addition, our method can generalize to unseen particles with unknown volumes. By estimating these particle properties, our method can help robots better perceive granular media and support different manipulation tasks in daily life and industry.
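One kind of vibration-related feature the abstract mentions can be sketched as spectral band energies of a tactile force signal. The sampling rate, band edges, and signals below are illustrative assumptions, not details from the paper:

```python
import numpy as np

def band_energies(signal, fs, bands):
    """Spectral energy per frequency band — a simple 'vibration-related
    feature' of the kind the abstract describes (band edges illustrative)."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    power = np.abs(np.fft.rfft(signal)) ** 2
    return np.array([power[(freqs >= lo) & (freqs < hi)].sum() for lo, hi in bands])

fs = 2000  # Hz, hypothetical high-speed tactile sampling rate
t = np.arange(0, 1.0, 1.0 / fs)
# Coarse particles rattling -> strong vibration; fine powder -> little vibration.
beans = 0.5 * np.sin(2 * np.pi * 60 * t)
powder = 0.02 * np.sin(2 * np.pi * 60 * t)

bands = [(20, 100), (100, 400), (400, 900)]
f_beans = band_energies(beans, fs, bands)
f_powder = band_energies(powder, fs, bands)
print(f_beans[0] > f_powder[0])  # coarse particles carry more vibration energy
```

A property estimator would consume such band energies alongside static force/torque readings rather than the raw signal.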
|
|
09:36-09:42, Paper WeAT12.12 | Add to My Program |
Acquisition and Prediction of High-Density Tactile Field Data for Rigid and Flexible Objects |
|
Xue, Hongxiang | Fudan University |
Liu, Pengkun | Fudan University |
Ju, Zhaoxun | Fudan University |
Sun, Fuchun | Tsinghua University |
Keywords: Force and Tactile Sensing, Haptics and Haptic Interfaces
Abstract: Obtaining high-density tactile field information is a critical aspect of research in the field of robotic haptics, as it plays a decisive role in determining the precision of robot manipulations. Vision-based tactile sensors have unique high-resolution features, which make them promising for related research. However, previous studies have mainly focused on reconstructing the shape of rigid objects or predicting the three-dimensional force on rigid objects, neglecting the analysis of flexible objects. Moreover, due to the resolution limitations of existing commercial sensors, the performance evaluation of previous force prediction models relied solely on the total force. To overcome these limitations and explore the tactile field information of objects with more attributes, this paper presents a detailed high-density tactile field data acquisition method based on a mechanical simulation environment. Additionally, we constructed a network to learn the mapping relationship between tactile images and six-dimensional tactile field information. Our results demonstrate that the proposed method can predict the three-dimensional force and displacement information of the object. Notably, the prediction error is within the tolerance range for fine manipulation by robots.
|
|
09:42-09:48, Paper WeAT12.13 | Add to My Program |
Simultaneous Shape and Tip Force Sensing for the COAST Guidewire Robot |
|
Deaton, Nancy Joanna | Georgia Institute of Technology |
Brumfiel, Timothy A. | Georgia Institute of Technology |
Sarma, Achraj | Georgia Institute of Technology |
Desai, Jaydev P. | Georgia Institute of Technology |
Keywords: Tendon/Wire Mechanism, Compliant Joints and Mechanisms
Abstract: Placement of catheters in minimally invasive cardiovascular procedures is preceded by navigating to the target lesion with a guidewire. Traversing through tortuous vascular pathways can be challenging without precise tip control, potentially resulting in the damage or perforation of blood vessels. To improve guidewire navigation, this paper presents 3D shape reconstruction and tip force sensing for the COaxially Aligned STeerable (COAST) guidewire robot using a triplet of adhered single core fiber Bragg grating sensors routed centrally through the robot’s slender structure. Additionally, several shape reconstruction algorithms are compared, and shape measurements are utilized to enable tip force sensing. Demonstration of the capabilities of the robot is shown in free air where the shape of the robot is reconstructed with average errors less than 2 mm at the guidewire tip, and the magnitudes of forces applied to the tip are estimated with an RMSE of 0.027 N or less.
|
|
WeAT13 Regular session, 260 Portside Ballroom |
Add to My Program |
Vision-Based Navigation III |
|
|
Chair: Deguchi, Hideki | Toyota Central R&D Labs., Inc |
Co-Chair: Natale, Lorenzo | Istituto Italiano Di Tecnologia |
|
08:30-08:36, Paper WeAT13.1 | Add to My Program |
Calibration-Free BEV Representation for Infrastructure Perception |
|
Fan, Siqi | Tsinghua University |
Wang, Zhe | Institute for AI Industry Research, Tsinghua University |
Huo, Xiaoliang | Beihang University |
Wang, Yan | Tsinghua University |
Liu, Jingjing | Institute for AI Industry Research (AIR), Tsinghua University |
Keywords: Computer Vision for Transportation, Object Detection, Segmentation and Categorization, Intelligent Transportation Systems
Abstract: Effective BEV object detection on infrastructure can greatly improve traffic scene understanding and vehicle-to-infrastructure (V2I) cooperative perception. However, cameras installed on infrastructure have various postures, and previous BEV detection methods rely on accurate calibration, which is difficult for practical applications due to inevitable natural factors (e.g., wind and snow). In this paper, we propose a Calibration-free BEV Representation (CBR) network, which achieves 3D detection based on BEV representation without calibration parameters or additional depth supervision. Specifically, we utilize two multi-layer perceptrons for decoupling the features from perspective view to front view and bird's-eye view under box-induced foreground supervision. Then, a cross-view feature fusion module matches features from orthogonal views according to similarity and conducts BEV feature enhancement with front-view features. Experimental results on DAIR-V2X demonstrate that CBR achieves acceptable performance without any camera parameters and is naturally not affected by calibration noise. We hope CBR can serve as a baseline for future research addressing practical challenges of infrastructure perception.
|
|
08:36-08:42, Paper WeAT13.2 | Add to My Program |
UVSS: Unified Video Stabilization and Stitching for Surround View of Tractor-Trailer Vehicles |
|
Zhu, Chunhui | Beijing Institute of Technology |
Yang, Yi | Beijing Institute of Technology |
Liang, Hao | Beijing Institute of Technology |
Dong, Zhipeng | Beijing Institute of Technology |
Fu, Mengyin | Beijing Institute of Technology |
Keywords: Omnidirectional Vision, Computer Vision for Transportation, Intelligent Transportation Systems
Abstract: Automotive surround-view camera systems have been commonly employed in automated driving to aid in near-field sensing and other perception tasks. Due to the large size of the body and the presence of multiple blind spots, panoramic surround-view systems are particularly crucial for tractor-trailer vehicles. However, the non-rigid body of tractor-trailer vehicles introduces pose changes between cameras, rendering traditional calibration-based methods inadequate. Additionally, cameras mounted separately on the tractor and the trailer experience independent vibrations, resulting in undesirable shakiness in captured videos. In this paper, we propose a unified video stabilization and stitching method to address these challenges, which can smooth the unsteady frames and align the images from moving cameras. Delving into video stabilization techniques, we extend the mesh-based motion model for unified stitching and leverage deep-learning-based modules to handle complex real-world scenarios. Moreover, we design a new optimization framework to estimate the optimal displacements of mesh vertices, enabling simultaneous stabilization and stitching of frames. The experimental results, obtained on public datasets and on videos captured from a model tractor-trailer vehicle, demonstrate that our approach outperforms previous methods and is highly effective in real-world applications.
|
|
08:42-08:48, Paper WeAT13.3 | Add to My Program |
Falcon: A Wide-And-Deep Onboard Active Vision System |
|
Hirano, Masahiro | The University of Tokyo |
Yamakawa, Yuji | The University of Tokyo |
Keywords: Computer Vision for Transportation, Intelligent Transportation Systems, Autonomous Vehicle Navigation
Abstract: The tradeoff between the field-of-view and resolution of conventional onboard vision systems primarily results from their fixed optical components. We propose a novel active vision system, Falcon, as an optimal solution. This system comprises an electric zoom lens connected to a high-speed camera with a pair of galvanometer mirrors, enabling high-resolution imaging of a moving object across a wide range, from near to far. To ensure accurate calibration of the Falcon system, we introduce a novel mapping-based calibration method using external cameras. We also present a robust and lightweight visual feedback method that utilizes this mapping-based calibration for effective object tracking. The effectiveness of the Falcon system is verified by constructing a prototype and conducting tracking experiments in an indoor setting, which demonstrated the superior performance of our method. Additionally, we successfully achieved continuous and high-resolution imaging of a curved mirror on public roads while the vehicle was moving.
|
|
08:48-08:54, Paper WeAT13.4 | Add to My Program |
Driver Distraction Detection for Daytime and Nighttime with Unpaired Visible and Infrared Image Translation |
|
Shen, Hong-Ze | National Chung Cheng University |
Lin, Huei-Yung | National Taipei University of Technology |
Keywords: Computer Vision for Transportation, Intelligent Transportation Systems
Abstract: Driver distraction detection is an important function of driver monitoring systems and intelligent vehicles. Most previous research focuses only on system development for daytime operation. In this paper, we propose a network model, V2IA-Net, which is able to use daytime visible and nighttime infrared images for the driver distraction detection task. With visible-infrared image translation, driver action recognition, and head pose detection, driver distraction behavior can be analyzed in real time. To provide realistic driving scenes for network training and testing, a visible-infrared image dataset, VID, is created. The proposed V2IA-Net is trained on unpaired images and is capable of common feature extraction for visible-infrared image conversion. In the experiments, our technique is compared with various driver distraction detection models. The results demonstrate the effectiveness of the proposed method. Source code and datasets are available at https://github.com/kk2487/V2IA-Net.
|
|
08:54-09:00, Paper WeAT13.5 | Add to My Program |
Long-Short Term Policy for Visual Object Navigation |
|
Bai, Yubing | Institute of Computing Technology, Chinese Academy of Sciences |
Song, Xinhang | Institute of Computing Technology, Chinese Academy of Sciences |
Li, Weijie | Institute of Computing Technology, Chinese Academy of Sciences |
Zhang, Sixian | ICT, UCAS |
Jiang, Shuqiang | Institute of Computing Technology, Chinese Academy of Sciences |
Keywords: Vision-Based Navigation
Abstract: The goal of visual object navigation is for an agent to find target objects accurately. Recent works mainly focus on feature embedding, attempting to learn better features with different variants, such as object distributions and graph representations. However, some typical navigation problems in complex environments, such as partially known environments and obstacles, may not be effectively addressed by previous feature embedding methods. In this paper, we propose a framework with a long-short objective policy, where the hidden states are classified according to the navigation objective at that moment and separately rewarded. Specifically, we consider two objectives: the long-term objective is to move closer to the target, and the short-term objective is obstacle avoidance and exploration. To alleviate the effect of long-term and short-term alternation, we build a state memory and propose an adjustment gate to update the state memory. Finally, all past hidden states are reweighted and combined for action prediction with an action-boosting gate. Experimental results on RoboTHOR show that the proposed method can significantly outperform the state-of-the-art.
|
|
09:00-09:06, Paper WeAT13.6 | Add to My Program |
Lidar-Based Multiple Object Tracking with Occlusion Handling |
|
Ho, Ruo-Tsz | National Yang Ming Chiao Tung University |
Wang, Chieh-Chih | National Yang Ming Chiao Tung University |
Lin, Wen-Chieh | National Yang Ming Chiao Tung University |
Keywords: Visual Tracking
Abstract: Occlusion remains an issue in multiple object tracking, which could cause ambiguity in object detection, such as incorrect or missing detection. Under occlusion, a track could experience an early termination, resulting in identity switches and/or fragmentation. To recover from different lengths of occlusions, the track should be maintained by considering its occlusion status. To address the issues mentioned above, we propose an indicator that can model the track's occlusion extent via geometric information provided by LiDAR data. Through incorporating the indicator into the track management and data association process, it is feasible to prevent tracks from premature termination. The proposed method is evaluated on the collected dataset which undergoes frequent and severe occlusions. Compared to the state-of-the-art probabilistic tracking approach, our approach achieves improvements of 3.26% in MOTA and 5.36% in IDF1. Additionally, we obtain 9.89% improvements in IDF1 specifically for objects experiencing severe occlusions.
|
|
09:06-09:12, Paper WeAT13.7 | Add to My Program |
Local and Global Information in Obstacle Detection on Railway Tracks |
|
Brucker, Matthias | ETH Zürich |
Cramariuc, Andrei | ETHZ |
von Einem, Cornelius | ETH Zürich |
Siegwart, Roland | ETH Zurich |
Cadena Lerma, Cesar | ETH Zurich |
Keywords: Computer Vision for Transportation, Object Detection, Segmentation and Categorization, Intelligent Transportation Systems
Abstract: Reliable obstacle detection on railways could help prevent collisions that result in injuries and potentially damage or derail the train. Unfortunately, generic object detectors do not have enough classes to account for all possible scenarios, and datasets featuring objects on railways are challenging to obtain. We propose utilizing a shallow network to learn railway segmentation from normal railway images. The limited receptive field of the network prevents overconfident predictions and allows the network to focus on the locally very distinct and repetitive patterns of the railway environment. Additionally, we explore the controlled inclusion of global information by learning to hallucinate obstacle-free images. We evaluate our method on a custom dataset featuring railway images with artificially augmented obstacles. Our proposed method outperforms other learning-based baseline methods.
|
|
09:12-09:18, Paper WeAT13.8 | Add to My Program |
Hybrid Object Tracking with Events and Frames |
|
Li, Zhichao | Istituto Italiano Di Tecnologia |
Piga, Nicola Agostino | Istituto Italiano Di Tecnologia |
Di Pietro, Franco | Istituto Italiano Di Tecnologia |
Iacono, Massimiliano | Istituto Italiano Di Tecnologia |
Glover, Arren | Istituto Italiano Di Tecnologia |
Natale, Lorenzo | Istituto Italiano Di Tecnologia |
Bartolozzi, Chiara | Istituto Italiano Di Tecnologia |
Keywords: Visual Tracking
Abstract: Robust object pose tracking plays an important role in robot manipulation, but it is still an open issue for quickly moving targets as motion blur and low frequency detection can reduce pose estimation accuracy even for state-of-the-art RGB-D-based methods. An event-camera is a low-latency vision sensor that can act complementary to RGB-D. Specifically, its submillisecond temporal resolution can be exploited to correct for pose estimation inaccuracies due to low frequency RGB-D based detection. To do so, we propose a dual Kalman filter: the first filter estimates an object’s velocity from the spatio-temporal patterns of “events”, the second filter fuses the tracked object velocity with a low-frequency object pose estimated from a deep neural network using RGB-D data. The full system outputs high frequency, accurate object poses also for fast moving objects. The proposed method works towards low-power robotics by replacing high-cost GPU-based optical flow used in prior work with event-cameras that inherently extract the required signal without costly processing. The proposed algorithm achieves comparable or better performance when compared to two state-of-the-art 6-DoF object pose estimation algorithms and one hybrid event/RGB-D algorithm on benchmarks with simulated and real data. We discuss the benefits and trade-offs for using the event-camera and contribute algorithm, code, and datasets to the community. The code and datasets are available at https://github.com/event-driven-robotics/Hybrid-object-tracking-with-events-and-frames.
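The dual-rate fusion idea — high-rate velocity from event patterns correcting drift between sparse RGB-D pose detections — can be sketched with a 1-D constant-velocity Kalman filter. All rates and noise values below are illustrative, not the authors' implementation:

```python
import numpy as np

# 1-D sketch: velocity measurements arrive at event rate (1 kHz) and keep
# the pose prediction fresh; pose measurements arrive at detector rate
# (~10 Hz), mimicking a low-frequency RGB-D network.
dt, T = 0.001, 2000            # 1 kHz updates over 2 s (hypothetical rates)
x, P = np.zeros(2), np.eye(2)  # state [position, velocity] and covariance
F = np.array([[1.0, dt], [0.0, 1.0]])
Q = np.diag([1e-8, 1e-4])      # process noise

rng = np.random.default_rng(0)
true_v = 0.5                   # constant true velocity (m/s)
for k in range(T):
    true_p = true_v * dt * (k + 1)
    x, P = F @ x, F @ P @ F.T + Q                        # predict
    H, R = np.array([[0.0, 1.0]]), np.array([[1e-2]])    # event-rate velocity
    z = true_v + 0.1 * rng.standard_normal()
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    x, P = x + K @ (np.array([z]) - H @ x), (np.eye(2) - K @ H) @ P
    if k % 100 == 0:                                     # sparse pose detection
        H, R = np.array([[1.0, 0.0]]), np.array([[1e-4]])
        z = true_p + 0.01 * rng.standard_normal()
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        x, P = x + K @ (np.array([z]) - H @ x), (np.eye(2) - K @ H) @ P

print(abs(x[0] - true_v * dt * T) < 0.05)  # pose tracked despite sparse detections
```

The paper's system performs this fusion in 6-DoF with one filter estimating object velocity from events and a second fusing it with the network's pose; the sketch keeps only the two-rate structure.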
|
|
09:18-09:24, Paper WeAT13.9 | Add to My Program |
Semantic Segmentation Based on Multiple Granularity Learning |
|
Wu, Kebin | Technology Innovation Institute |
Bawazir, Ameera | Technology Innovation Institute |
Xiao, Xiaofei | Technology Innovation Institute |
Avula, Venkata Seetharama Sai Bhargav Kumar | Technology Innovation Institute |
Almazrouei, Ebtesam | Technology Innovation Institute |
Roura, Eloy | Technology Innovation Institute |
Debbah, Merouane | Technology Innovation Institute |
Keywords: Computer Vision for Transportation, Object Detection, Segmentation and Categorization, Autonomous Vehicle Navigation
Abstract: Accurate and robust coarse semantic segmentation plays a key role in the pursuit of autonomous driving. We present an algorithm that regularizes the representation space of Semantic Segmentation by Multiple Granularity Learning (SSMGL). This approach explores multiple levels of semantic knowledge in a unified framework, where the fine-grained semantic information can be either labeled or unlabeled. In our experiments, we find that SSMGL achieves better results (1) on both on-road and off-road benchmarks, (2) under different segmentation architectures, and (3) with different backbones. The method is plug-and-play, not specialized for autonomous driving applications, and can be easily extended to any other segmentation scenario. Moreover, our SSMGL approach does not increase the computational overhead in the inference stage.
|
|
09:24-09:30, Paper WeAT13.10 | Add to My Program |
Enhanced Robot Navigation with Human Geometric Instruction |
|
Deguchi, Hideki | Toyota Central R&D Labs., Inc |
Taguchi, Shun | Toyota Central R&D Labs., Inc |
Shibata, Kazuki | Toyota Central R&D Labs., INC |
Koide, Satoshi | TOYOTA Central R&D Labs., INC |
Keywords: Human-Centered Robotics, Human-Robot Collaboration, Vision-Based Navigation
Abstract: Recently, robot navigation methods using human instructions have been actively studied, including visual language navigation. Although language is one of the most promising forms of instruction, words often contain ambiguities. To mitigate this problem, we propose to use geometric instruction as a clue to the task goal. Specifically, in our proposed system, we assume that the robot receives a rough position of the target from a human gesture. The robot adaptively estimates the reliability of this geometric instruction and switches between exploration and instruction-following modes depending on the reliability value. We evaluated our method in a 3D simulation environment and show that the task success rate and other metrics improve compared with baseline methods.
|
|
09:30-09:36, Paper WeAT13.11 | Add to My Program |
InterTracker: Discovering and Tracking General Objects Interacting with Hands in the Wild |
|
Shao, Yanyan | Zhejiang University of Technology |
Ye, Qi | Zhejiang University |
Luo, Wenhan | Sun Yat-Sen University |
Zhang, Kaihao | Australian National University |
Chen, Jiming | Zhejiang University |
Keywords: Visual Tracking, Object Detection, Segmentation and Categorization, Visual Learning
Abstract: Understanding human interaction with objects is an important research topic for embodied Artificial Intelligence, and identifying the objects that humans are interacting with is a primary problem for interaction understanding. Existing methods rely on frame-based detectors to locate interacting objects. However, this approach is susceptible to heavy occlusion, background clutter, and distracting objects. To address these limitations, in this paper, we propose to leverage spatio-temporal information of hand-object interaction to track interactive objects under these challenging cases. Unlike standard object-tracking problems, which assume prior knowledge of the object to be tracked, we first utilize the spatial relation between hands and objects to adaptively discover the interacting objects from the scene. Second, the consistency and continuity of the appearance of objects between successive frames are exploited to track the objects. With this tracking formulation, our method also benefits from training on large-scale general object-tracking datasets. We further curate a video-level hand-object interaction dataset for testing and evaluation from 100DOH. The quantitative results demonstrate that our proposed method outperforms the state-of-the-art methods. Specifically, in scenes with continuous interaction with different objects, we achieve an impressive improvement of about 10% as evaluated using the Average Precision (AP) metric. Our qualitative findings also illustrate that our method can produce more continuous trajectories for interacting objects.
|
|
09:36-09:42, Paper WeAT13.12 | Add to My Program |
Multidimensional Particle Filter for Long-Term Visual Teach and Repeat in Changing Environments |
|
Rozsypálek, Zdeněk | Czech Technical University in Prague |
Rouček, Tomáš | Czech Technical University in Prague |
Vintr, Tomas | FEE, Czech Technical University in Prague |
Krajník, Tomáš | Czech Technical University |
Keywords: Vision-Based Navigation, Probabilistic Inference, Sensor Fusion
Abstract: When a mobile robot is asked to navigate intelligently in an environment, it needs to estimate its own and the environment's state. One of the popular methods for robot state and position estimation is particle filtering (PF). Visual Teach and Repeat (VT&R) is a type of navigation that uses a camera to navigate the robot along the previously traversed path. Particle filters are usually used in VT&R to fuse data from odometry and camera to estimate the distance traveled along the path. However, in VT&R, there are other valuable states that the robot can benefit from, especially when moving through changing environments. We propose a multidimensional particle filter to estimate the robot state in VT&R navigation. Apart from the traveled distance, our particle filter estimates lateral and heading deviation from the taught path as well as the current appearance of the environment. This appearance is estimated using maps created across various environmental conditions recorded during the previous traversals. The joint state estimation is based on contrastive neural network architecture, allowing self-supervised learning. This architecture can process multiple images in parallel, alleviating the potential overhead caused by computing the particle filter over the maps simultaneously. We conducted experiments to show that the joint robot/environment state estimation improves navigation accuracy and robustness in a continual mapping setup. Unlike the other frameworks, which treat the robot position and environment appearance separately, our PF represents them as one multidimensional state, resulting in a more general uncertainty model for VT&R.
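A toy 1-D slice of such a particle filter, estimating only the distance travelled along the taught path with a synthetic appearance likelihood, might look like this (the full method adds lateral offset, heading, and environment-appearance dimensions to the same state, with the likelihood coming from a contrastive network):

```python
import numpy as np

rng = np.random.default_rng(1)
n, true_d = 500, 0.0
particles = rng.uniform(0.0, 10.0, n)   # hypotheses: distance along a 10 m path
weights = np.full(n, 1.0 / n)

for step in range(50):
    odo = 0.1                            # 10 cm odometry step per iteration
    true_d += odo
    particles += odo + 0.02 * rng.standard_normal(n)          # noisy prediction
    # Stand-in for image similarity between current view and the taught map:
    sim = np.exp(-0.5 * ((particles - true_d) / 0.3) ** 2)
    weights = weights * sim + 1e-12
    weights /= weights.sum()
    if 1.0 / np.sum(weights**2) < n / 2:  # resample on low effective sample size
        idx = rng.choice(n, n, p=weights)
        particles, weights = particles[idx], np.full(n, 1.0 / n)

estimate = np.sum(weights * particles)
print(abs(estimate - true_d) < 0.3)       # converges near the true distance
```

Representing all dimensions jointly, as the paper does, lets one likelihood evaluation inform position and appearance hypotheses at once instead of filtering them separately.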
|
|
09:42-09:48, Paper WeAT13.13 | Add to My Program |
Learning When to Use Adaptive Adversarial Image Perturbations against Autonomous Vehicles |
|
Yoon, Hyungjin | University of Nevada, Reno |
Jafarnejadsani, Hamidreza | Stevens Institute of Technology |
Voulgaris, Petros G. | University of Illinois at Urbana-Champaign |
Keywords: Computer Vision for Transportation, Reinforcement Learning, Machine Learning for Robot Control
Abstract: Deep neural network (DNN) models are widely used in autonomous vehicles for object detection using camera images. However, these models are vulnerable to adversarial image perturbations. Existing methods for generating these perturbations use each incoming image frame as the decision variable, resulting in a computationally expensive optimization process that starts over for each new image. Few approaches have been developed for attacking online image streams while considering the physical dynamics of autonomous vehicles, their mission, and the environment. To address these challenges, we propose a multi-level stochastic optimization framework that monitors the attacker's capability to generate adversarial perturbations. Our framework introduces a binary decision attack/not attack based on the attacker's capability level to enhance its effectiveness. We evaluate our proposed framework using simulations for vision-guided autonomous vehicles and actual tests with a small indoor drone in an office environment. Our results demonstrate that our method is capable of generating real-time image attacks while monitoring the attacker's proficiency given state estimates.
|
|
WeAT14 Regular session, 320 |
Add to My Program |
SLAM I |
|
|
Chair: Spenko, Matthew | Illinois Institute of Technology |
Co-Chair: Koide, Kenji | National Institute of Advanced Industrial Science and Technology |
|
08:30-08:36, Paper WeAT14.1 | Add to My Program |
ECTLO: Effective Continuous-Time Odometry Using Range Image for LiDAR with Small FoV |
|
Zheng, Xin | Zhejiang University |
Zhu, Jianke | Zhejiang University |
Keywords: SLAM, Mapping, Range Sensing
Abstract: Prism-based LiDARs are more compact and cheaper than conventional mechanical multi-line spinning LiDARs, and have recently become increasingly popular in robotics. However, these new LiDAR sensors pose several challenges, including a small field of view, severe motion distortions, and irregular patterns. These difficulties have hindered their wide practical use in LiDAR odometry. To tackle these problems, we present an effective continuous-time LiDAR odometry (ECTLO) method for Risley-prism-based LiDARs with non-repetitive scanning patterns. A single range image covering historical points in the LiDAR's small FoV is adopted for efficient map representation. To account for the noisy data from occlusions after map updating, a filter-based point-to-plane Gaussian Mixture Model is used for robust registration. Moreover, a LiDAR-only continuous-time motion model is employed to relieve the inevitable distortions. Extensive experiments have been conducted on various testbeds using prism-based LiDARs with different scanning patterns, whose promising results demonstrate the efficacy of our proposed approach.
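The geometric core of the registration step — linearized point-to-plane alignment — can be sketched as a single least-squares solve. The data below are synthetic, and the paper's GMM weighting and continuous-time motion model are omitted:

```python
import numpy as np

# Point-to-plane residual: r_i = n_i · (R p_i + t - q_i). With the small-angle
# approximation R ≈ I + [w]x, each row of the linear system is
# [p_i × n_i, n_i] · [w, t] = n_i · (q_i - p_i).
rng = np.random.default_rng(2)
src = rng.uniform(-1, 1, (100, 3))       # source points p_i
t_true = np.array([0.05, -0.02, 0.03])
dst = src + t_true                       # targets q_i: pure translation for simplicity
normals = rng.standard_normal((100, 3))  # synthetic surface normals n_i
normals /= np.linalg.norm(normals, axis=1, keepdims=True)

A = np.hstack([np.cross(src, normals), normals])       # (100, 6) Jacobian
b = np.einsum("ij,ij->i", normals, dst - src)          # stacked residual targets
xi, *_ = np.linalg.lstsq(A, b, rcond=None)             # xi = [wx, wy, wz, tx, ty, tz]

print(np.allclose(xi[3:], t_true, atol=1e-6))          # recovers the translation
```

A full odometry pipeline iterates this solve over re-established correspondences, here replaced by known ground-truth pairings.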
|
|
08:36-08:42, Paper WeAT14.2 | Add to My Program |
An Efficient Global Optimality Certificate for Landmark-Based SLAM |
|
Holmes, Connor | University of Toronto |
Barfoot, Timothy | University of Toronto |
Keywords: SLAM, Optimization and Optimal Control
Abstract: Modern state estimation is often formulated as an optimization problem and solved using efficient local search methods. These methods at best guarantee convergence to local minima, but, in some cases, global optimality can also be certified. Although such global optimality certificates have been well established for 3D pose-graph optimization, the details have yet to be worked out for the 3D landmark-based SLAM problem, in which estimated states include both robot poses and map landmarks. In this paper, we address this gap by using a graph-theoretic approach to cast the subproblems of landmark-based SLAM into a form that yields a sufficient condition for global optimality. Efficient methods of computing the optimality certificates for these subproblems exist, but first require the construction of a large data matrix. We show that this matrix can be constructed with complexity that remains linear in the number of landmarks and does not exceed the state-of-the-art computational complexity of one local solver iteration. We demonstrate the efficacy of the certificate on simulated and real-world landmark-based SLAM problems. We also integrate our method into the state-of-the-art SE-Sync pipeline to efficiently solve landmark-based SLAM problems to global optimality. Finally, we study the robustness of the global optimality certificate to measurement noise, taking into consideration the effect of the underlying measurement graph.
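The sufficient-condition check behind such certificates can be sketched as a positive-semidefiniteness test on a certificate matrix. The toy matrix below is illustrative only; constructing the actual matrix from landmark-based SLAM data is the paper's contribution:

```python
import numpy as np

def certify(S, tol=1e-8):
    """Sufficient condition used by optimality certificates: the candidate
    solution is globally optimal if the certificate matrix S (built from
    problem data and the candidate's Lagrange multipliers) is positive
    semidefinite, i.e. its smallest eigenvalue is nonnegative."""
    return np.linalg.eigvalsh(S)[0] >= -tol

# Toy stand-in: the graph Laplacian of a triangle graph is PSD
# (eigenvalues 0, 3, 3), so the check passes; shifting it down fails.
L = np.array([[ 2.0, -1.0, -1.0],
              [-1.0,  2.0, -1.0],
              [-1.0, -1.0,  2.0]])
print(certify(L), certify(L - 0.5 * np.eye(3)))
```

For large problems, the minimum-eigenvalue test is typically run with sparse iterative methods rather than a dense eigendecomposition.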
|
|
08:42-08:48, Paper WeAT14.3 | Add to My Program |
Place Recognition of Large-Scale Unstructured Orchards with Attention Score Maps |
|
Ou, Fang | Shanghai University |
Li, Yunhui | Shanghai University |
Li, Nan | Shanghai University |
Xu, Nan | School of Mechanical and Electronic Engineering, Shandong Jianzh |
Miao, Zhonghua | Shanghai University |
Keywords: Robotics and Automation in Agriculture and Forestry, Range Sensing, SLAM
Abstract: The availability of autonomous orchard robots could alleviate the conflict caused by rising labor costs and labor shortages. The essential technical requirements are autonomous localization and mapping, which rely on place recognition to establish data associations. This paper presents a novel LiDAR-based place recognition algorithm for unstructured and large-scale orchards. Concretely, we propose a discriminative global representation, the spatial binary pattern (SBP), that encodes the three-dimensional (3D) spatial distribution of a scan into an eight-bit binary pattern. In addition, an efficient two-stage hierarchical re-identification process is proposed. An attention score map is introduced for task-relevant features and preliminary candidate retrieval. Overlap re-identification is used to align a pair of descriptors to confirm the final loop closure index. Experiments on orchard and public datasets have been conducted to evaluate the performance of the proposed method; it achieves a higher recall rate and localization accuracy. Moreover, experiments on the long-term outdoor dataset KITTI further demonstrate its generality.
|
|
08:48-08:54, Paper WeAT14.4 | Add to My Program |
TwistSLAM++: Fusing Multiple Modalities for Accurate Dynamic Semantic SLAM |
|
Gonzalez, Mathieu | IRT B<>com |
Marchand, Eric | Univ Rennes, Inria, CNRS, IRISA |
Kacete, Amine | IRT B<>com |
Royan, Jerome | IRT B-Com |
Keywords: SLAM
Abstract: Most classical SLAM systems rely on the static scene assumption, which limits their applicability in real-world scenarios. Recent SLAM frameworks have been proposed to simultaneously track the camera and moving objects. However, they are often unable to estimate the canonical pose of the objects and exhibit low object tracking accuracy. To solve this problem, we propose TwistSLAM++, a semantic, dynamic SLAM system that fuses stereo images and LiDAR information. Using semantic information, we track potentially moving objects and associate them with 3D object detections in LiDAR scans to obtain their pose and size. Then, we perform registration on consecutive object scans to refine object pose estimation. Finally, object scans are used to estimate the shape of the object and constrain map points to lie on the estimated surface within the bundle adjustment. We show on classical benchmarks that this fusion approach based on multimodal information improves the accuracy of object tracking.
|
|
08:54-09:00, Paper WeAT14.5 | Add to My Program |
SLAM and Shape Estimation for Soft Robots |
|
Karimi, Mohammad Amin | Illinois Institute of Technology |
Cañones Bonham, David Francesc | Illinois Institute of Technology |
Lopez, Esteban | Illinois Institute of Technology |
Srivastava, Ankit | Illinois Institute of Technology |
Spenko, Matthew | Illinois Institute of Technology |
Keywords: SLAM, Multi-Robot Systems, Modeling, Control, and Learning for Soft Robots
Abstract: This paper describes Simultaneous Localization and Mapping (SLAM) techniques for mobile soft robots using on-board local sensors. The paper focuses on planar boundary-constrained swarms, which are composed of identical modular sub-units, each flexibly connected to its neighbor. The sub-units themselves are not necessarily soft, but as the robot's size increases with respect to the size of the sub-units, the robot as a whole approaches a continuous system that exhibits the characteristics and behavior of a soft robot. Previous versions of this system have demonstrated grasping, shape formation, and tunneling; however, all prior embodiments have relied on external sensing for pose estimation. This paper is the first to demonstrate a fully self-sufficient boundary-constrained swarm soft robot that does not rely on external pose estimation. The robot successfully navigates a maze-like environment while localizing and mapping the environment.
|
|
09:00-09:06, Paper WeAT14.6 | Add to My Program |
SemanticLoop: Loop Closure with 3D Semantic Graph Matching |
|
Yu, Junfeng | The Hong Kong University of Science and Technology |
Shen, Shaojie | Hong Kong University of Science and Technology |
Keywords: SLAM, Localization, Mapping
Abstract: Loop closure can effectively correct the accumulated error in robot localization, which plays a critical role in the long-term navigation of the robot. Traditional appearance-based methods rely on local features and are prone to failure in ambiguous environments. On the other hand, object recognition can infer objects' category, pose, and extent. These objects can serve as stable semantic landmarks for viewpoint-independent and non-ambiguous loop closure. However, there is a critical object-level data association problem due to the lack of efficient and robust algorithms. We introduce a novel object-level data association algorithm, which incorporates IoU, instance-level embedding, and detection uncertainty, formulated as a linear assignment problem. Then, we model the objects as TSDF volumes and represent the environment as a 3D graph with semantics and topology. Next, we propose a graph matching-based loop detection based on the reconstructed 3D semantic graphs and correct the accumulated error by aligning the matched objects. Finally, we refine the object poses and camera trajectory in an object-level pose graph optimization. Experimental results show that the proposed object-level data association method significantly outperforms the commonly used nearest neighbor method in accuracy. Our graph matching-based loop closure is more robust to environmental appearance changes than existing appearance-based methods.
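The object-level data association described above can be framed as a linear assignment over an affinity matrix that combines IoU, embedding similarity, and detection confidence. The sketch below is an assumed form of that formulation; the weights `w` and gating threshold `gate` are illustrative, not the paper's values.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(iou, emb_sim, uncertainty, w=(0.5, 0.4, 0.1), gate=0.3):
    """Object-level data association as a linear assignment problem.

    iou, emb_sim, uncertainty: (num_tracks, num_detections) matrices.
    Combines IoU, instance-embedding similarity, and (1 - detection
    uncertainty) into an affinity matrix, then solves for the maximum-
    affinity one-to-one matching and gates out weak pairs.
    """
    affinity = w[0] * iou + w[1] * emb_sim + w[2] * (1.0 - uncertainty)
    rows, cols = linear_sum_assignment(-affinity)  # negate to maximize
    return [(r, c) for r, c in zip(rows, cols) if affinity[r, c] >= gate]
```

Unlike a nearest-neighbor rule, the assignment solver resolves conflicts jointly, which is what gives the method its robustness when objects are close together.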
|
|
09:06-09:12, Paper WeAT14.7 | Add to My Program |
Trajectory-Based SLAM for Indoor Mobile Robots with Limited Sensing Capabilities |
|
Chen, Yao | IRobot Corporation |
Rodriguez, Jeremias | IRobot |
Karimian, Arman | Boston University |
Pheil, Benjamin | IRobot Corporation |
Franco, Jose | IRobot Corp |
Moser, Renaud | IRobot Corp |
Sandstrom, Read | IRobot |
Lenser, Scott | IRobot Corporation |
Gritsenko, Artem | IRobot |
Tamino, Daniele | IRobot |
Tenaglia Giunta, Felipe Andres | IRobot |
Li, Guanlai | IRobot Corporation |
Wasserman, Philip | None |
Okerholm Huttlin, Andrea | IRobot |
Keywords: SLAM, Sensor Fusion, Localization
Abstract: In this paper we introduce a novel SLAM system for 2-D indoor environments that relies only on limited sensing. Our fully autonomous system uses only the trajectory of the robot around walls and objects in the environment as landmarks and is capable of robust and long-term exploration and mapping of a broad range of household floor plans. Rank-deficient and full-rank factors are created when the robot observes existing trajectory-based landmarks, and they are filtered and added in a pose graph, which is optimized periodically. The mission space is mapped by efficient adaptive local mapping algorithms. The proposed SLAM system has been extensively tested in various scenarios, and experimental results show its robustness and accuracy.
|
|
09:12-09:18, Paper WeAT14.8 | Add to My Program |
Graph-Based Robot Global Localization Informing Situational Graphs with Architectural Graphs |
|
Shaheer, Muhammad | University of Luxembourg |
Millan Romera, Jose Andres | University of Luxembourg |
Bavle, Hriday | University of Luxembourg |
Sanchez-Lopez, Jose Luis | Interdisciplinary Center for Security, Reliability and Trust (Sn |
Civera, Javier | Universidad De Zaragoza |
Voos, Holger | University of Luxembourg |
Keywords: SLAM, Localization, Legged Robots
Abstract: In this paper, we propose a solution for legged robot localization using architectural plans. Our specific contributions towards this goal are several. Firstly, we develop a method for converting the plan of a building into what we denote as an architectural graph (A-Graph). When the robot starts moving in an environment, we assume it has no knowledge about it, and it estimates an online situational graph representation (S-Graph) of its surroundings. We develop a novel graph-to-graph matching method in order to relate the S-Graph estimated online from the robot sensors and the A-Graph extracted from the building plans. This matching is challenging, as the S-Graph may show only a partial view of the full A-Graph, their nodes are heterogeneous, and their reference frames are different. After the matching, both graphs are aligned and merged, resulting in what we denote as an informed Situational Graph (iS-Graph), with which we achieve global robot localization and exploitation of prior knowledge from the building plans. Our experiments show that our pipeline achieves higher robustness and a significantly lower pose error than several LiDAR localization baselines.
|
|
09:18-09:24, Paper WeAT14.9 | Add to My Program |
SSGM: Spatial Semantic Graph Matching for Loop Closure Detection in Indoor Environments |
|
Tang, Yujie | Beijing Institute of Technology |
Wang, Meiling | Beijing Institute of Technology |
Deng, Yinan | Beijing Institute of Technology |
Yang, Yi | Beijing Institute of Technology |
Yue, Yufeng | Beijing Institute of Technology |
Keywords: Mapping, SLAM, Semantic Scene Understanding
Abstract: Capturing the semantics of objects and their topological relationships allows a robot to describe a scene more intelligently, like a human, and to measure the similarity between scenes (loop closure detection) more accurately. However, many current semantic graph matching methods are based on walk descriptors, which only extract adjacency relations between objects. In this way, the comprehensive information in the semantic graph is not fully exploited, which may lead to false loop closure detections. This paper proposes a novel spatial semantic graph matching method (SSGM) for indoor environments, which considers multifaceted information in the semantic graphs. Firstly, two semantic graphs are aligned in the same coordinate space by means of the second-order spatial compatibility metric between objects and the local graph features of objects in the semantic graphs. Secondly, the similarity of the spatial distribution of the overall semantic graphs is further evaluated. The proposed algorithm is validated on public datasets and compared with the latest semantic graph matching methods, demonstrating improved accuracy and efficiency in loop closure detection. The code is available at https://github.com/BIT-TYJ/SSGM.
|
|
09:24-09:30, Paper WeAT14.10 | Add to My Program |
Training-Free Attentive-Patch Selection for Visual Place Recognition |
|
Zhang, Dongshuo | Nanyang Technological University |
Wu, Meiqing | Nanyang Technological University |
Lam, Siew Kei | Nanyang Technological University |
Keywords: SLAM, Localization, Mapping
Abstract: Visual Place Recognition (VPR) utilizing patch descriptors from Convolutional Neural Networks (CNNs) has shown impressive performance in recent years. Existing works either perform exhaustive matching of all patch descriptors, or employ complex networks to select good candidate patches for further geometric verification. In this work, we develop a novel two-step training-free patch selection method that is fast, while being robust to large occlusions and extreme viewpoint variations. In the first step, a self-attention mechanism is used to select sparse and evenly distributed discriminative patches in the query image. Next, a novel spatial-matching method is used to rapidly select corresponding patches with high similar appearances between the query and each reference image. The proposed method is inspired by how humans perform place recognition by first identifying prominent regions in the query image, and then relying on back-and-forth visual inspection of the query and reference image to attentively identify similar regions while ignoring dissimilar ones. Extensive experiment results show that our proposed method outperforms state-of-the-art (SOTA) methods in both place recognition precision and runtime, on various challenging conditions.
|
|
09:30-09:36, Paper WeAT14.11 | Add to My Program |
Exact Point Cloud Downsampling for Fast and Accurate Global Trajectory Optimization |
|
Koide, Kenji | National Institute of Advanced Industrial Science and Technology |
Oishi, Shuji | National Institute of Advanced Industrial Science and Technology |
Yokozuka, Masashi | Nat. Inst. of Advanced Industrial Science and Technology |
Banno, Atsuhiko | National Institute of Advanced Industrial Science and Technology |
Keywords: SLAM, Mapping, Localization
Abstract: This paper presents a point cloud downsampling algorithm for fast and accurate trajectory optimization based on global registration error minimization. The proposed algorithm selects a weighted subset of residuals of the input point cloud such that the subset yields exactly the same quadratic point cloud registration error function as that of the original point cloud at the evaluation point. This method accurately approximates the original registration error function with only a small subset of input points (29 residuals at a minimum). Experimental results using the KITTI dataset demonstrate that the proposed algorithm significantly reduces processing time (by 87%) and memory consumption (by 99%) for global registration error minimization while retaining accuracy.
|
|
09:36-09:42, Paper WeAT14.12 | Add to My Program |
GAPSLAM: Blending Gaussian Approximation and Particle Filters for Real-Time Non-Gaussian SLAM |
|
Huang, Qiangqiang | Massachusetts Institute of Technology |
Leonard, John | MIT |
Keywords: SLAM, Probabilistic Inference
Abstract: Inferring the posterior distribution in SLAM is critical for evaluating the uncertainty in localization and mapping, as well as supporting subsequent planning tasks aiming to reduce uncertainty for safe navigation. However, real-time full posterior inference techniques, such as Gaussian approximation and particle filters, either lack expressiveness for representing non-Gaussian posteriors or suffer from performance degeneracy when estimating high-dimensional posteriors. Inspired by the complementary strengths of Gaussian approximation and particle filters—scalability and non-Gaussian estimation, respectively—we blend these two approaches to infer marginal posteriors in SLAM. Specifically, Gaussian approximation provides robot pose distributions on which particle filters are conditioned to sample landmark marginals. In return, the maximum a posteriori point among these samples can be used to reset linearization points in the nonlinear optimization solver of the Gaussian approximation, facilitating the pursuit of global optima. We demonstrate the scalability, generalizability, and accuracy of our algorithm for real-time full posterior inference on real-world range-only SLAM and object-based bearing-only SLAM datasets.
|
|
09:42-09:48, Paper WeAT14.13 | Add to My Program |
Multi-Scale Point Octree Encoding Network for Point Cloud Based Place Recognition |
|
Tang, Zhilong | Southern University of Science and Technology |
Ye, Hanjing | Southern University of Science and Technology |
Zhang, Hong | SUSTech |
Keywords: SLAM
Abstract: In the past decades, point cloud-based place recognition has attracted much attention. In this paper, we propose a novel network named Multi-scale Point Octree Encoding Network (MPOE-Net) to learn a discriminative global descriptor of a place for retrieval. The point octree encoding module captures local information of each point from its nearest and farthest neighbors. We then use a multi-transformer network to adaptively learn the local relationship for each point by using a novel grouped offset-attention network. A multi-NetVLAD layer is employed to aggregate multi-scale attention maps into a global descriptor. Experiments on various benchmark datasets demonstrate that our method achieves state-of-the-art performance in the point cloud-based place recognition task. Our code is released publicly at https://github.com/Zhilong-Tang/MPOE-Net.
|
|
09:48-09:54, Paper WeAT14.14 | Add to My Program |
Analytical Jacobian Approximation for Direct Optimization of a Trajectory of Interpolated Poses on SE(3) |
|
Botashev, Kazii | Skolkovo Institute of Science and Technology (Skoltech) |
Ferrer, Gonzalo | Skolkovo Institute of Science and Technology |
Keywords: SLAM, Localization, Sensor Fusion
Abstract: This paper relates to time-continuous trajectory representation using direct linear interpolation on SE(3). Our approach focuses on a novel analytical Jacobian approximation of a sequence of linearly interpolated poses on SE(3). This paper shows a derivation of the proposed analytical Jacobian using retraction mapping and an approximation to the commutativity property of infinitesimal group elements. We provide extensive evaluations on three different optimization problems. For the synthetic point cloud alignment problem, our proposed Jacobian is compared with a numerical one. For the synthetic pose graph optimization problem, the proposed Jacobian approximation allows us to reduce the state dimensions by a factor of 7 while keeping a similar magnitude of resulting error compared to the full discrete-time trajectory. Finally, we show the validity of our approach in a time-continuous formulation of a real-world LiDAR odometry problem.
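The interpolation scheme the Jacobian is built on can be sketched directly: T(t) = T0 · exp(t · log(T0⁻¹ T1)) for t in [0, 1]. The sketch below uses generic matrix exp/log as a stand-in for closed-form SE(3) maps; the paper's contribution, the analytical Jacobian of this map, is not reproduced here.

```python
import numpy as np
from scipy.linalg import expm, logm

def interp_se3(T0, T1, t):
    """Direct linear interpolation of two 4x4 homogeneous poses on SE(3):
    T(t) = T0 @ expm(t * logm(inv(T0) @ T1)).
    t = 0 returns T0, t = 1 returns T1, intermediate t moves along the
    geodesic-like screw path between them.
    """
    rel = np.linalg.inv(T0) @ T1       # relative transform in T0's frame
    return T0 @ np.real(expm(t * logm(rel)))
```

A continuous-time trajectory then only needs a sparse set of control poses, which is where the factor-of-7 state reduction quoted above comes from.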
|
|
WeAT15 Regular session, 321 |
Add to My Program |
Multi-Robot and Distributed Robot Systems I |
|
|
Chair: Tzoumas, Vasileios | University of Michigan, Ann Arbor |
Co-Chair: Mina, Tamzidul | Sandia National Laboratories |
|
08:30-08:36, Paper WeAT15.1 | Add to My Program |
On Cyber-Attacks Mitigation for Distributed Trajectory Generators |
|
Al-Rawashdeh, Yazan | Memorial University of Newfoundland |
Al Janaideh, Mohammad | University of Guelph |
Keywords: Multi-Robot Systems
Abstract: In this paper, an immune average consensus behavior of distributed trajectory generators, given in the form of a multi-agent system, is presented. Starting from the well-known results on linear consensus protocols, we propose a decomposition of the invariant consensus value to enable a distributed cyber-attack detection and mitigation mechanism among the connected agents over mainly undirected communication links. This decomposition suggests one preferred propagation of the invariant quantity along the communication links of the multi-agent systems under study. Despite its simplicity, the effectiveness of this mechanism in detecting and mitigating various types of cyber-attacks is evident through numerical simulation. Interestingly, the resulting defense mechanism is not passive; rather, it can initiate counter-attack measures by pretending that the attack process was a success. Moreover, the trajectory generators can operate in a stealth mode in which the communication links are silenced or totally disconnected without affecting the intended behavior, once the consensus value has been locked.
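For reference, the linear average consensus protocol the abstract builds on iterates x_i ← x_i + ε Σ_j a_ij (x_j − x_i) over an undirected graph; on a connected graph all states converge to the initial average, which is the invariant quantity the proposed mechanism decomposes. A minimal sketch (step size and graph are illustrative):

```python
import numpy as np

def consensus_step(x, adjacency, eps=0.1):
    """One step of the linear average consensus protocol
    x <- x - eps * L @ x, with graph Laplacian L = D - A.
    eps must be smaller than 1/(max degree) for convergence.
    """
    deg = adjacency.sum(axis=1)
    laplacian = np.diag(deg) - adjacency
    return x - eps * laplacian @ x
```

Because the sum of states is preserved at every step on an undirected graph, any attacker-induced drift of the consensus value is detectable in principle, which is the hook the paper's defense mechanism exploits.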
|
|
08:36-08:42, Paper WeAT15.2 | Add to My Program |
Distributed Framework Matching (I) |
|
Cao, Kun | Nanyang Technological University |
Li, Xiuxian | Tongji University |
Xie, Lihua | Nanyang Technological University |
Keywords: Multi-Robot Systems, Distributed Robot Systems, Task Planning, Object Matching
Abstract: This paper studies the problem of distributed framework matching (FM), which originates from the assignment task in multi-robot coordination and the matching task in pattern recognition. The objective of distributed FM is to distributively seek a correspondence which minimizes some metrics describing the disagreement between two frameworks (i.e., graphs and their embeddings). In view of the type of the underlying graph in the framework, two formulations, undirected framework matching (UFM) and directed framework matching (DFM), and their convex relaxations, relaxed UFM (RUFM) and relaxed DFM (RDFM), are presented. UFM is converted into a graph matching (GM) problem with the adjacency matrix being replaced by a matrix constructed from the undirected framework under certain graphical conditions, and can be solved distributively. Sufficient conditions for the equivalence between UFM and RUFM, and the perturbation admitting exact recovery of correspondence are established. On the other hand, DFM embeds the configuration of the directed framework via another type of matrix, whose computation is distributed, and can deal with the case of two frameworks with different s
|
|
08:42-08:48, Paper WeAT15.3 | Add to My Program |
Online Submodular Coordination with Bounded Tracking Regret: Theory, Algorithm, and Applications to Multi-Robot Coordination |
|
Xu, Zirui | University of Michigan |
Zhou, Hongyu | University of Michigan |
Tzoumas, Vasileios | University of Michigan, Ann Arbor |
Keywords: Multi-Robot Systems, Planning under Uncertainty, Optimization and Optimal Control
Abstract: We enable efficient and effective coordination in unpredictable environments, i.e., in environments whose future evolution is unknown a priori and even adversarial. We are motivated by the future of autonomy that involves multiple robots coordinating in dynamic, unstructured, and adversarial environments to complete complex tasks such as target tracking, environmental mapping, and area monitoring. Such tasks are often modeled as submodular maximization coordination problems. We introduce the first submodular coordination algorithm with bounded tracking regret, i.e., with bounded suboptimality with respect to optimal time-varying actions that know the future a priori. The bound gracefully degrades with the environments’ capacity to change adversarially. It also quantifies how often the robots must re-select actions to “learn” to coordinate as if they knew the future a priori. The algorithm requires the robots to select actions sequentially based on the actions selected by the previous robots in the sequence. Particularly, the algorithm generalizes the seminal Sequential Greedy algorithm by Fisher et al. to unpredictable environments, leveraging submodularity and algorithms for the problem of tracking the best expert. We validate our algorithm in simulated scenarios of target tracking.
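The seminal Sequential Greedy scheme that the paper generalizes can be sketched in a few lines: each robot picks the action with the largest marginal gain given the choices of the robots before it in the sequence. The objective and action sets below are a made-up coverage example, not the paper's benchmark.

```python
def sequential_greedy(robots_actions, f):
    """Sequential Greedy (Fisher et al.) for submodular maximization: robot k
    maximizes the marginal gain f(chosen + [a]) - f(chosen) given the actions
    already chosen by robots 1..k-1. Near-optimal for monotone submodular f.
    """
    chosen = []
    for actions in robots_actions:
        best = max(actions, key=lambda a: f(chosen + [a]) - f(chosen))
        chosen.append(best)
    return chosen

# Toy target-tracking objective: number of distinct targets covered
# (a monotone submodular set function). Visibility sets are illustrative.
VIS = {"a": {1, 2}, "b": {2, 3}, "c": {3, 4}, "d": {1}}

def coverage(chosen):
    seen = set()
    for a in chosen:
        seen |= VIS[a]
    return len(seen)
```

The paper's contribution is to keep a bounded tracking-regret version of this guarantee when the objective changes adversarially over time.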
|
|
08:48-08:54, Paper WeAT15.4 | Add to My Program |
Robust Task Scheduling for Heterogeneous Robot Teams under Capability Uncertainty (I) |
|
Fu, Bo | University of Michigan |
Smith, William | US Army TARDEC |
Rizzo, Denise M. | U.S. Army TARDEC |
Castanier, Matthew P. | US Army DEVCOM GVSC |
Ghaffari, Maani | University of Michigan |
Barton, Kira | University of Michigan at Ann Arbor |
Keywords: Planning, Scheduling and Coordination, Path Planning for Multiple Mobile Robots or Agents, Multi-Robot Systems, Task allocation
Abstract: This paper develops a stochastic programming framework for multi-agent systems where task decomposition, assignment, and scheduling problems are simultaneously optimized. The framework can be applied to heterogeneous mobile robot teams with distributed sub-tasks. Examples include pandemic robotic service coordination, explore and rescue, and delivery systems with heterogeneous vehicles. Due to their inherent flexibility and robustness, multi-agent systems are applied in a growing range of real-world problems that involve heterogeneous tasks and uncertain information. Most previous works assume one fixed way to decompose a task into roles that can later be assigned to the agents. This assumption is not valid for a complex task where the roles can vary and multiple decomposition structures exist. Meanwhile, it is unclear how uncertainties in task requirements and agent capabilities can be systematically quantified and optimized under a multi-agent system setting. A representation for complex tasks is proposed: agent capabilities are represented as a vector of random distributions, and task requirements are verified by a generalizable binary function. The conditional value at risk (CVaR) is chosen as a metric in the objective function to generate robust plans. An efficient algorithm is described to solve the model, and the whole framework is evaluated in two different practical test cases: capture-the-flag and robotic service coordination during a pandemic (e.g., COVID-19). Results demonstrate that the framework is generalizable, scalable up to 140 agents and 40 tasks for the example test cases, and provides low-cost plans that ensure a high probability of success.
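The conditional value at risk used as the objective metric above has a simple sample-based estimate: the mean cost in the worst (1 − α) tail of the cost distribution. A minimal sketch (the α value is illustrative):

```python
import numpy as np

def cvar(costs, alpha=0.9):
    """Conditional value at risk of a cost distribution, from samples:
    the expected cost given that cost exceeds the alpha-quantile (VaR).
    """
    var = np.quantile(costs, alpha)   # value at risk = alpha-quantile
    tail = costs[costs >= var]        # worst (1 - alpha) fraction of outcomes
    return tail.mean()
```

Minimizing CVaR rather than expected cost is what makes the resulting schedules robust: plans are penalized for their worst-case tail, not just their average.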
|
|
08:54-09:00, Paper WeAT15.5 | Add to My Program |
CAMETA: Conflict-Aware Multi-Agent Estimated Time of Arrival Prediction for Mobile Robots |
|
le Fevre Sejersen, Jonas | Aarhus University |
Kayacan, Erdal | Paderborn University |
Keywords: Multi-Robot Systems, Autonomous Agents, Logistics
Abstract: This study presents the conflict-aware multi-agent estimated time of arrival (CAMETA) framework, a novel approach for predicting the arrival times of multiple agents in unstructured environments without predefined road infrastructure. The CAMETA framework consists of three components: a path planning layer generating potential path suggestions, a multi-agent ETA prediction layer predicting the arrival times for all agents based on the paths, and lastly, a path selection layer that calculates the accumulated cost and selects the best path. The novelty of the CAMETA framework lies in the heterogeneous map representation and the heterogeneous graph neural network architecture. As a result of the proposed novel structure, CAMETA improves the generalization capability compared to the state-of-the-art methods that rely on structured road infrastructure and historical data. The simulation results demonstrate the efficiency and efficacy of the multi-agent ETA prediction layer, with a mean average percentage error improvement of 29.5% and 44% when compared to a traditional path planning method (A^*) which does not consider conflicts. The performance of the CAMETA framework shows significant improvements in terms of robustness to noise and conflicts as well as determining proficient routes compared to state-of-the-art multi-agent path planners.
|
|
09:00-09:06, Paper WeAT15.6 | Add to My Program |
Nonlinear Heterogeneous Bayesian Decentralized Data Fusion |
|
Dagan, Ofer | University of Colorado Boulder |
Ahmed, Nisar | University of Colorado Boulder |
Cinquini, Tycho | Cooperative Human-Robot Intelligence Lab at CU Boulder |
Keywords: Multi-Robot Systems, Cooperating Robots, Distributed Robot Systems
Abstract: The factor graph decentralized data fusion (FG-DDF) framework was developed for the analysis and exploitation of conditional independence in heterogeneous Bayesian decentralized fusion problems, in which robots update and fuse pdfs over different, but overlapping subsets of random states. This allows robots to efficiently use smaller probabilistic models and sparse message passing to accurately and scalably fuse relevant local parts of a larger global joint state pdf while accounting for data dependencies between robots. Whereas prior work required limiting assumptions about network connectivity and model linearity, this paper relaxes these to explore the applicability and robustness of FG-DDF in more general settings. We develop a new heterogeneous fusion rule which generalizes the homogeneous covariance intersection algorithm for such cases and test it in multi-robot tracking and localization scenarios with non-linear motion/observation models under communication dropouts. Simulation and hardware experiments show that, in practice, the FG-DDF continues to provide consistent filtered estimates under these more practical operating conditions, while reducing computation and communication costs by more than 99%, thus enabling the design of scalable real-world multi-robot systems.
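The homogeneous covariance intersection rule that the paper's heterogeneous fusion rule generalizes fuses two estimates with unknown cross-correlation as P⁻¹ = ω P_a⁻¹ + (1 − ω) P_b⁻¹, x = P (ω P_a⁻¹ x_a + (1 − ω) P_b⁻¹ x_b). A minimal sketch (ω is fixed here for brevity; in practice it is chosen to minimize, e.g., the trace of P):

```python
import numpy as np

def covariance_intersection(xa, Pa, xb, Pb, omega=0.5):
    """Covariance intersection fusion of two estimates (xa, Pa) and (xb, Pb)
    whose cross-correlation is unknown. Returns the fused mean and a
    covariance guaranteed consistent for any actual correlation.
    """
    Ia, Ib = np.linalg.inv(Pa), np.linalg.inv(Pb)
    P = np.linalg.inv(omega * Ia + (1.0 - omega) * Ib)
    x = P @ (omega * Ia @ xa + (1.0 - omega) * Ib @ xb)
    return x, P
```

Note that, unlike a naive Kalman update, the fused covariance does not shrink below the inputs when they are equally informative, which is what keeps repeated fusion over a network consistent.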
|
|
09:06-09:12, Paper WeAT15.7 | Add to My Program |
BRNES: Enabling Security and Privacy-Aware Experience Sharing in Multiagent Robotic and Autonomous Systems |
|
Hossain, Md Tamjid | University of Nevada, Reno |
La, Hung | University of Nevada at Reno |
Badsha, Shahriar | Bosch Engineering, North America |
Netchaev, Anton | USACE ERDC |
Keywords: Multi-Robot Systems, Autonomous Agents, Cooperating Robots
Abstract: Although experience sharing (ES) accelerates multiagent reinforcement learning (MARL) in an advisor-advisee framework, attempts to apply ES to decentralized multiagent systems (MAS) have so far relied on trusted environments and overlooked the possibility of adversarial manipulation and inference. Nevertheless, in a real-world setting, some Byzantine attackers, disguised as advisors, may provide false advice to the advisee and catastrophically degrade the overall learning performance. Also, an inference attacker, disguised as an advisee, may conduct several queries to infer the advisors' private information and make the entire ES process questionable in terms of privacy leakage. To address and tackle these issues, we propose a novel MARL framework (BRNES) that heuristically selects a dynamic neighbor zone for each advisee at each learning step and adopts a weighted experience aggregation technique to reduce Byzantine attack impact. Furthermore, to keep the agent's private information safe from adversarial inference attacks, we leverage the local differential privacy (LDP)-induced noise during the ES process. Our experiments show that our framework outperforms the state-of-the-art in terms of the steps to goal, obtained reward, and time to goal metrics. Particularly, our evaluation shows that the proposed framework is 8.32x faster than the current non-private frameworks and 1.41x faster than the private frameworks in an adversarial setting.
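The local differential privacy step mentioned above is commonly realized with the Laplace mechanism: each agent perturbs its experience locally before sharing, so no single shared sample reveals the true value. A minimal sketch (the sensitivity and epsilon values in the usage are illustrative, not the paper's settings):

```python
import numpy as np

def ldp_perturb(value, sensitivity, epsilon, rng):
    """Laplace mechanism for local differential privacy: add zero-mean
    Laplace noise with scale sensitivity/epsilon to the value before it
    leaves the agent. Smaller epsilon means stronger privacy, more noise.
    """
    scale = sensitivity / epsilon
    return value + rng.laplace(0.0, scale, size=np.shape(value))
```

Aggregated over many shared experiences the noise averages out, which is why the advisee can still learn from the perturbed advice.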
|
|
09:12-09:18, Paper WeAT15.8 | Add to My Program |
Beacon-Based Distributed Structure Formation in Multi-Agent Systems |
|
Mina, Tamzidul | Sandia National Laboratories |
Jo, Wonse | Purdue University |
Kannan, Shyam Sundar | Purdue University |
Min, Byung-Cheol | Purdue University |
Keywords: Multi-Robot Systems, Cooperating Robots, Robotics and Automation in Construction
Abstract: Autonomous shape and structure formation is an important problem in the domain of large-scale multi-agent systems. In this paper, we propose a 3D structure representation method and a distributed structure formation strategy where settled agents guide free moving agents to a prescribed location to settle in the structure. Agents at the structure formation frontier looking for neighbors to settle act as beacons, generating a surface gradient throughout the formed structure propagated by settled agents. Free-moving agents follow the surface gradient along the formed structure surface to the formation frontier, where they eventually reach the closest beacon and settle to continue the structure formation following a local bidding process. Agent behavior is governed by a finite state machine implementation, along with potential field-based motion control laws. We also discuss appropriate rules for recovering from stagnation points. Simulation experiments are presented to show planar and 3D structure formations with continuous and discontinuous boundary/surfaces, which validate the proposed strategy, followed by a scalability analysis.
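The surface gradient described above can be modeled, in its simplest form, as a hop-count distance field from the beacon agents, computed by multi-source breadth-first search over the settled agents' neighbor graph; free agents then descend this field toward the nearest beacon. This is an illustrative stand-in for the paper's propagation rule, not its implementation.

```python
from collections import deque

def surface_gradient(adjacency, beacons):
    """Hop-count gradient from beacon agents, propagated through settled
    agents via multi-source BFS. adjacency maps each agent id to its list
    of neighbor ids; beacons is the list of frontier agents at distance 0.
    """
    dist = {b: 0 for b in beacons}
    queue = deque(beacons)
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist
```

Each settled agent only needs its neighbors' values to maintain its own, so the field can be kept up to date with purely local communication as the structure grows.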
|
|
09:18-09:24, Paper WeAT15.9 | Add to My Program |
Decentralized Swarm Trajectory Generation for LiDAR-Based Aerial Tracking in Cluttered Environments |
|
Yin, Longji | The University of Hong Kong |
Zhu, Fangcheng | The University of Hong Kong |
Ren, Yunfan | The University of Hong Kong |
Kong, Fanze | The University of Hong Kong |
Zhang, Fu | University of Hong Kong |
Keywords: Multi-Robot Systems, Aerial Systems: Applications, Motion and Path Planning
Abstract: Aerial tracking with multiple unmanned aerial vehicles (UAVs) has wide potential in various applications. However, the existing works for swarm tracking typically lack the capability of maintaining high target visibility in cluttered environments. To address this deficiency, we present a decentralized planner that maximizes target visibility while ensuring collision-free maneuvers for swarm tracking. In this paper, each drone's tracking performance is first analyzed by a decentralized kinodynamic searching front-end, which renders an optimal guiding path to initialize safe flight corridors and visible sectors. Afterwards, a polynomial trajectory satisfying the corridor constraints is generated by a spatial-temporal optimizer. Inter-vehicle collision and occlusion avoidance are also incorporated into the optimization objectives. The advantages of our methods are verified by extensive benchmark comparisons against other cutting-edge works. Integrated with an autonomous LiDAR-based swarm system, the proposed planner demonstrates its efficiency and robustness in real-world experiments with unknown cluttered surroundings.
|
|
09:24-09:30, Paper WeAT15.10 | Add to My Program |
Decentralized Planning for Car-Like Robotic Swarm in Cluttered Environments |
|
Ma, Changjia | Zhejiang University |
Han, Zhichao | Zhejiang University |
Zhang, Tingrui | Zhejiang University |
Wang, Jingping | Zhejiang University |
Xu, Long | Zhejiang University |
Li, Chengyang | Southern University of Science and Technology |
Xu, Chao | Zhejiang University |
Gao, Fei | Zhejiang University |
Keywords: Multi-Robot Systems, Swarm Robotics, Motion and Path Planning
Abstract: Robot swarms are a hot spot in the robotics research community. In this paper, we propose a decentralized framework for a car-like robotic swarm that is capable of real-time planning in cluttered environments. In this system, path finding is guided by environmental topology information to avoid frequent topological changes, and search-based speed planning is leveraged to escape from the local minima of an infeasible initial value. Spatial-temporal optimization is then employed to generate a safe, smooth, and dynamically feasible trajectory. During optimization, the trajectory is discretized by fixed time steps. A penalty is imposed on the signed distance between agents to realize collision avoidance, and differential flatness combined with a limit on the front steer angle satisfies the non-holonomic constraints. With trajectories broadcast over the wireless network, agents are able to check for and prevent potential collisions. We validate the robustness of our system in simulation and real-world experiments. Code will be released as open-source packages.
|
|
09:30-09:36, Paper WeAT15.11 | Add to My Program |
SCRIMP: Scalable Communication for Reinforcement and Imitation-Learning-Based Multi-Agent Pathfinding |
|
Wang, Yutong | National University of Singapore |
Xiang, Bairan | National University of Singapore |
Huang, Shinan | National University of Singapore |
Sartoretti, Guillaume Adrien | National University of Singapore (NUS) |
Keywords: Path Planning for Multiple Mobile Robots or Agents, Reinforcement Learning, Multi-Robot Systems
Abstract: Trading off performance guarantees in favor of scalability, the Multi-Agent Path Finding (MAPF) community has recently started to embrace Multi-Agent Reinforcement Learning (MARL), where agents learn to collaboratively generate individual, collision-free (but often suboptimal) paths. Scalability is usually achieved by assuming a local field of view (FOV) around the agents, helping scale to arbitrary world sizes. However, this assumption significantly limits the amount of information available to the agents, making it difficult for them to enact the type of joint maneuvers needed in denser MAPF tasks. In this paper, we propose SCRIMP, where agents learn individual policies from even very small (down to 3x3) FOVs by relying on a highly-scalable global/local communication mechanism based on a modified transformer. We further equip agents with a state-value-based tie-breaking strategy to improve performance in symmetric situations, and introduce intrinsic rewards to encourage exploration while mitigating the long-term credit assignment problem. Empirical evaluations on a set of experiments indicate that SCRIMP can achieve higher performance with improved scalability compared to other state-of-the-art learning-based MAPF planners with larger FOVs, and even yields similar performance to a classical centralized planner in many cases. Ablation studies further validate the effectiveness of our proposed techniques. Finally, we show that our trained model can be directly implemented on real robots for online MAPF through high-fidelity simulations in Gazebo.
|
|
09:36-09:42, Paper WeAT15.12 | Add to My Program |
Distributed Model Predictive Formation Control of Robots with Sampled Trajectory Sharing in Cluttered Environments |
|
Satır, Muhlis Sami | METU, Aselsan Inc |
Aktas, Yasin Furkan | TOBB University of Economics and Technology |
Atasoy, Simay | Middle East Technical University |
Ankarali, Mustafa Mert | Middle East Technical University |
Sahin, Erol | Middle East Technical University |
Keywords: Distributed Robot Systems, Swarm Robotics, Motion Control
Abstract: In this paper, we propose a Model Predictive Control (MPC) based distributed formation control method for a multi-robot system (MRS) that moves the robots among dynamic obstacles to a desired goal position. Specifically, after formulating formation control as a distributed version of MPC, we propose and evaluate three information-sharing schemes within the MRS, namely sharing (i) positions, (ii) complete predicted trajectories, and (iii) exponentially-sampled predicted trajectories. Using a simplified kinematic model for the robots, we conducted systematic simulation experiments in (a) scenarios where the robots are instructed to switch places, as one of the most challenging forms of formation changes, and (b) scenarios where robots are instructed to reach a goal within environments containing dynamic obstacles. In a set of systematic experiments conducted in simulation and with mini quadcopters, we show that sharing exponentially-sampled trajectories (as opposed to positions or complete trajectories) among the robots provides near-optimal paths while decreasing the required computation cost and communication bandwidth. Surprisingly, in the presence of noise, sharing exponentially-sampled trajectories among the robots decreased the variance in the final paths. The proposed method is demonstrated on a group of Crazyflie quadcopters.
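Scheme (iii) can be sketched in a few lines. This is a hedged illustration of the idea only, not the authors' code; the exact index rule (powers of two plus the horizon endpoint) is an assumption:

```python
def exp_sample_indices(n):
    """Indices 0, 1, 2, 4, 8, ... into an n-point predicted trajectory,
    always keeping the horizon endpoint n-1. Early waypoints (which matter
    most for collision checking) are kept densely; late ones sparsely."""
    idx, i = [], 1
    if n > 0:
        idx.append(0)
    while i < n:
        idx.append(i)
        i *= 2
    if n > 1 and idx[-1] != n - 1:
        idx.append(n - 1)
    return idx

def exp_sample(traj):
    """Exponentially-sampled version of a predicted trajectory to broadcast,
    trading a full horizon for a handful of waypoints."""
    return [traj[i] for i in exp_sample_indices(len(traj))]
```

For a 20-step horizon this shares 7 waypoints instead of 20, which is the bandwidth/fidelity trade-off the abstract describes.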
|
|
WeAT16 Regular session, 330A |
Add to My Program |
Learning for Navigation |
|
|
Chair: Kwon, Jaerock | University of Michigan-Dearborn |
Co-Chair: Hayashibe, Mitsuhiro | Tohoku University |
|
08:30-08:36, Paper WeAT16.1 | Add to My Program |
Curriculum Reinforcement Learning from Avoiding Collisions to Navigating among Movable Obstacles in Diverse Environments |
|
Wang, Hsueh-Cheng | National Yang Ming Chiao Tung University, Taiwan |
Huang, Siao-Cing | National Yang Ming Chiao Tung University |
Huang, Po-Jui | National Chiao Tung University |
Wang, Kuo-Lun | National Chiao Tung University |
Teng, Yi-Chen | NYCU |
Ko, Yu-Ting | National Yang Ming Chiao Tung University |
Jeon, Dongsuk | Seoul National University |
Wu, I-Chen | National Chiao Tung University |
Keywords: Search and Rescue Robots, Reinforcement Learning, Transfer Learning
Abstract: Curriculum learning has proven highly effective at speeding up training convergence with improved performance in a variety of tasks. Researchers have been studying how a curriculum can be constituted to train reinforcement learning (RL) agents in various application domains. However, discovering curriculum sequencing requires ranking sub-tasks or samples in order of difficulty, which has not yet been sufficiently studied for robot navigation problems. It is still an open question which navigation strategies can be learned and transferred during multi-stage transfer learning from easy to hard. Furthermore, despite some attempts at learning real robot manipulation tasks using a curriculum, most existing works are limited to toy or simulated settings rather than realistic scenarios. To address these issues, we first investigated how model convergence in diverse environments relates to navigation strategies and difficulty metrics. We found that only some of the environments can be trained from scratch, such as a relatively open tunnel-like environment that only requires wall following. We then carried out two-stage transfer learning for more difficult environments. We found this approach effective for goal navigation, but it failed for more complex tasks where movable obstacles may be on the navigation path. To facilitate more complex policies in the navigation among movable obstacles (NAMO) task, another curriculum with distance and pace functions appropriate to the difficulty of the environment was developed. The proposed scheme proved effective, and the strategies learned are discussed via comprehensive evaluations conducted in simulated and real environments. Supplementary materials can be found at arg-nctu.github.io/proj
|
|
08:36-08:42, Paper WeAT16.2 | Add to My Program |
Control Transformer: Robot Navigation in Unknown Environments through PRM-Guided Return-Conditioned Sequence Modeling |
|
Lawson, Daniel | Purdue University |
Qureshi, Ahmed H. | Purdue University |
Keywords: Learning from Experience, Reinforcement Learning, Deep Learning Methods
Abstract: Learning long-horizon tasks such as navigation has presented difficult challenges for successfully applying reinforcement learning to robotics. From another perspective, under known environments, sampling-based planning can robustly find collision-free paths in environments without learning. In this work, we propose Control Transformer that models return conditioned sequences from low-level policies guided by a sampling-based Probabilistic Roadmap (PRM) planner. We demonstrate that our framework can solve long-horizon navigation tasks using only local information. We evaluate our approach on partially-observed maze navigation with MuJoCo robots, including Ant, Point, and Humanoid. We show that Control Transformer can successfully navigate through mazes and transfer to unknown environments. Additionally, we apply our method to a differential drive robot (Turtlebot3) and show zero-shot sim2real transfer under noisy observations.
|
|
08:42-08:48, Paper WeAT16.3 | Add to My Program |
InteractionNet: Joint Planning and Prediction for Autonomous Driving with Transformers |
|
Fu, Jiawei | Institute of Artificial Intelligence and Robotics |
Shen, Yanqing | Xi'an Jiaotong University |
Jian, Zhiqiang | Xi'an Jiaotong University |
Chen, Shitao | Xi'an Jiaotong University |
Xin, Jingmin | Xi'an Jiaotong University |
Zheng, Nanning | Xi'an Jiaotong University |
Keywords: Imitation Learning, Autonomous Vehicle Navigation
Abstract: Planning and prediction are two important modules of autonomous driving and have experienced tremendous advancement recently. Nevertheless, most existing methods regard planning and prediction as independent and ignore the correlation between them, leading to a lack of consideration for the interaction and dynamic changes of traffic scenarios. To address this challenge, we propose InteractionNet, which leverages a transformer to share global contextual reasoning among all traffic participants to capture interaction, and interconnects planning and prediction to achieve joint planning and prediction. Besides, InteractionNet deploys another transformer to help the model pay extra attention to the perceived region containing critical or unseen vehicles. InteractionNet outperforms other baselines in several benchmarks, especially in terms of safety, which benefits from the joint consideration of planning and forecasting. The code will be available at https://github.com/fujiawei0724/InteractionNet.
|
|
08:48-08:54, Paper WeAT16.4 | Add to My Program |
ANEC: Adaptive Neural Ensemble Controller for Mitigating Latency Problems in Vision-Based Autonomous Driving |
|
Khalil, Aws | University of Michigan-Dearborn |
Kwon, Jaerock | University of Michigan-Dearborn |
Keywords: Autonomous Vehicle Navigation, Machine Learning for Robot Control, Imitation Learning
Abstract: Humans have latency in their visual perception system between observation and action. Any action we take is based on an earlier observation since, by the time we act, the state has already changed and a new observation has arrived. In autonomous driving, this latency is also present, determined by the amount of time the control algorithm needs to process information before acting. This algorithmic perception latency can be reduced by massive computing power via GPUs and FPGAs, which is impractical on automobile platforms. Thus, it is a reasonable assumption that algorithmic perception latency is inevitable. Many researchers have developed neural network driving models without considering algorithmic perception latency. This paper studies the latency effect on vision-based neural network autonomous driving in the lane-keeping task and proposes a novel vision-based neural network controller, the Adaptive Neural Ensemble Controller (ANEC), inspired by the near/far gaze distribution of human drivers during lane-keeping. ANEC was tested in the Gazebo 3D simulation environment with the Robot Operating System (ROS), which showed the effectiveness of ANEC in dealing with algorithmic latency. The source code is available at https://github.com/jrkwon/oscar/tree/devel_anec.
|
|
08:54-09:00, Paper WeAT16.5 | Add to My Program |
A Dynamic Programming Algorithm for Grid-Based Formation Planning of Multiple Vehicles |
|
Au, Tsz-Chiu | Ulsan National Institute of Science and Technology |
Keywords: Intelligent Transportation Systems, Path Planning for Multiple Mobile Robots or Agents, Planning, Scheduling and Coordination
Abstract: A common operation in multirobot systems is to generate a motion plan for multiple robots such that the robots can move in formation to achieve some desired effect. For example, in autonomous parking lots, a group of vehicles can be asked to move to another location when they block another vehicle that needs to leave the parking lot. In this paper, we present a novel grid-based planning approach for motion planning that minimizes the makespan of moving multiple vehicles from one location to another in a safe manner. Unlike most existing multirobot planning algorithms, our algorithm uses dynamic programming to compute a nearly-optimal motion plan for a large group of vehicles in polynomial time, with the help of a given set of intermediate vehicle patterns. Our experimental results show that our algorithm is much faster than an exact algorithm while only modestly increasing the minimum makespans.
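A toy version of the pattern-to-pattern dynamic program might look like the sketch below. All names are hypothetical: the transition cost is a brute-force vehicle-to-slot assignment under Manhattan distance, which ignores the paper's collision-safe timing and is only feasible for tiny patterns:

```python
from itertools import permutations

def transition_makespan(src, dst):
    """Makespan of moving vehicles from pattern src to pattern dst: the max
    Manhattan distance under the best vehicle-to-slot assignment."""
    best = None
    for perm in permutations(dst):
        cost = max(abs(a[0] - b[0]) + abs(a[1] - b[1]) for a, b in zip(src, perm))
        best = cost if best is None else min(best, cost)
    return best

def min_makespan(stages):
    """DP over stages of candidate patterns: stages[0] holds the start
    pattern(s), stages[-1] the goal pattern(s), and middle stages offer
    alternative intermediate patterns. Returns the cheapest chained makespan."""
    prev = [0] * len(stages[0])
    for s in range(1, len(stages)):
        prev = [min(prev[i] + transition_makespan(stages[s - 1][i], pat)
                    for i in range(len(stages[s - 1])))
                for pat in stages[s]]
    return min(prev)
```

The point of the sketch is only the DP structure: the best plan through given intermediate patterns is a shortest path over stages, which is what makes the overall computation polynomial in the number of stages.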
|
|
09:00-09:06, Paper WeAT16.6 | Add to My Program |
LB-L2L-Calib 2.0: A Novel Online Extrinsic Calibration Method for Multiple Long Baseline 3D LiDARs Using Objects |
|
Zhang, Jun | Nanyang Technological University |
Yan, Qiao | Nanyang Technological University |
Wen, Mingxing | China-Singapore International Joint Research Center |
Lyu, Qiyang | Nanyang Technological University |
Peng, Guohao | Nanyang Technological University |
Wu, Zhenyu | Nanyang Technological University |
Wang, Danwei | Nanyang Technological University |
Keywords: Intelligent Transportation Systems, Sensor Fusion, Calibration and Identification
Abstract: In V2X (Vehicle-to-Everything), one important task is to extrinsically calibrate multiple 3D LiDARs that are mounted with a long baseline and large viewpoint difference at the roadside. Current solutions either require a specific target to be set up (e.g., a sphere) or require specific features to exist in the environment (e.g., mutually orthogonal planes). However, it is time-consuming, and sometimes inconvenient, to set up specific targets, e.g., at busy intersections and highways. Furthermore, specific features do not always exist in the traffic scenario. Thus, the current solutions are not applicable. To address this problem, a novel extrinsic calibration method is proposed in this paper, namely LB-L2L-Calib 2.0. It is the 2.0 version of our previous work. The novelties are: 1) We propose to use easily-accessible objects on the road (i.e., the vehicles) as features for calibration. Thus, it is not necessary to set up any specific targets, and we do not need to worry about whether specific features exist or not. The key point is that the 3D bounding box centers of the vehicles are viewpoint-invariant across different viewpoints, which makes them ideal features for long-baseline, large viewpoint-difference calibration. 2) To establish correct correspondence between the bounding box centers detected from different LiDARs, we propose an exhaustive searching strategy that robustly outputs correct correspondence. Extensive experiments are performed in three scenarios (simulation: intersection; real: carpark and highway) with two types of LiDAR (Velodyne and Livox), demonstrating that LB-L2L-Calib 2.0 is robust, effective, and accurate.
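The exhaustive correspondence search over bounding-box centers can be illustrated with a small sketch (an assumption-laden simplification, not the paper's algorithm: only the translation between the two viewpoints is compensated via centroid alignment, whereas the actual method must also recover rotation):

```python
from itertools import permutations

def match_centers(a, b):
    """Exhaustively search correspondences between 3D box centers seen by two
    LiDARs: for each permutation of b, subtract the centroids of both sets and
    score the summed squared residual; the lowest-residual permutation is taken
    as the correspondence (center i of a matches element i of the result)."""
    ca = [sum(p[k] for p in a) / len(a) for k in range(3)]
    best, best_perm = float("inf"), None
    for perm in permutations(b):
        cb = [sum(p[k] for p in perm) / len(perm) for k in range(3)]
        res = sum(sum((pa[k] - ca[k] - (pb[k] - cb[k])) ** 2 for k in range(3))
                  for pa, pb in zip(a, perm))
        if res < best:
            best, best_perm = res, perm
    return best_perm
```

Centroid subtraction makes the score invariant to the unknown translation between the LiDARs, which is why a translated-and-shuffled copy of the same centers scores a near-zero residual under the correct permutation.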
|
|
09:06-09:12, Paper WeAT16.7 | Add to My Program |
A GM-PHD Filter with Estimation of Probability of Detection and Survival for Individual Targets |
|
Perera, R.A. Thivanka | University of Rhode Island |
Jeong, Mingi | Dartmouth College |
Quattrini Li, Alberto | Dartmouth College |
Stegagno, Paolo | University of Rhode Island |
Keywords: Probability and Statistical Methods, Autonomous Vehicle Navigation, Marine Robotics
Abstract: This paper proposes a modification of the Gaussian mixture probability hypothesis density (GM-PHD) filter to compute online the probability of detection (PD) and probability of survival (PS) of targets. This eliminates the need for predetermined and/or constant PD and PS values, which may degrade the estimation. The proposed filter estimates the PD and PS values for each individual target based on newly introduced parameters, which are updated during the measurement update process. The effectiveness of the proposed filter was validated through an in-lab experiment using four unmanned ground robots with varying PD values and a real-world lidar-based obstacle tracking system implemented on an Automated Surface Vehicle operating in a lake with real-time boat traffic. The results of the experiments demonstrate that the proposed filter outperforms the standard PHD filter with incorrect PD and PS values. These findings highlight the potential benefits of the proposed filter in improving target tracking performance in complex environments.
|
|
09:12-09:18, Paper WeAT16.8 | Add to My Program |
F2BEV: Bird's Eye View Generation from Surround-View Fisheye Camera Images for Automated Driving |
|
Samani, Ekta | University of Washington |
Tao, Feng | University of Texas at San Antonio |
Dasari, Harshavardhan Reddy | Volvo Cars Technology USA LLC |
Ding, Sihao | Volvo Cars |
Banerjee, Ashis | University of Washington |
Keywords: Intelligent Transportation Systems, Computer Vision for Transportation
Abstract: Bird's Eye View (BEV) representations are tremendously useful for perception-related automated driving tasks. However, generating BEVs from surround-view fisheye camera images is challenging due to the strong distortions introduced by such wide-angle lenses. We take the first step in addressing this challenge and introduce a baseline, F2BEV, to generate discretized BEV height maps and BEV semantic segmentation maps from fisheye images. F2BEV consists of a distortion-aware spatial cross attention module for querying and consolidating spatial information from fisheye image features in a transformer-style architecture followed by a task-specific head. We evaluate single-task and multi-task variants of F2BEV on our synthetic FB-SSEM dataset, all of which generate better BEV height and segmentation maps (in terms of the IoU) than a state-of-the-art BEV generation method operating on undistorted fisheye images. We also demonstrate discretized height map generation from real-world fisheye images using F2BEV. Our dataset is publicly available at https://github.com/volvo-cars/FB-SSEM-dataset
|
|
09:18-09:24, Paper WeAT16.9 | Add to My Program |
One-4-All: Neural Potential Fields for Embodied Navigation |
|
Morin, Sacha | Université De Montréal, Mila |
Saavedra, Miguel | Université De Montréal |
Paull, Liam | Université De Montréal |
Keywords: Vision-Based Navigation, Deep Learning Methods, Representation Learning
Abstract: A fundamental task in robotics is to navigate between two locations. In particular, real-world navigation can require long-horizon planning using high-dimensional RGB images, which poses a substantial challenge for end-to-end learning-based approaches. Current semi-parametric methods instead achieve long-horizon navigation by combining learned modules with a topological memory of the environment, often represented as a graph over previously collected images. However, using these graphs in practice requires tuning a number of pruning heuristics. These heuristics are necessary to avoid spurious edges, limit runtime memory usage and maintain reasonably fast graph queries in large environments. In this work, we present One-4-All (O4A), a method leveraging self-supervised and manifold learning to obtain a graph-free, end-to-end navigation pipeline in which the goal is specified as an image. Navigation is achieved by greedily minimizing a potential function defined continuously over image embeddings. Our system is trained offline on non-expert exploration sequences of RGB data and controls, and does not require any depth or pose measurements. We show that O4A can reach long-range goals in 8 simulated Gibson indoor environments and that resulting embeddings are topologically similar to ground truth maps, even if no pose is observed. We further demonstrate successful real-world navigation using a Jackal UGV platform.
|
|
09:24-09:30, Paper WeAT16.10 | Add to My Program |
Communication Resources Constrained Hierarchical Federated Learning for End-To-End Autonomous Driving |
|
Kou, Wei-Bin | The University of Hong Kong |
Wang, Shuai | Shenzhen Institute of Advanced Technology, Chinese Academy of Sc |
Zhu, Guangxu | Shenzhen Research Institute of Big Data |
Luo, Bin | The University of Hong Kong |
Chen, Yingxian | The University of Hong Kong |
Ng, Derrick Wing Kwan | University of New South Wales |
Wu, Yik-Chung | The University of Hong Kong |
Keywords: Multi-Robot Systems, Learning from Demonstration, Autonomous Vehicle Navigation
Abstract: While federated learning (FL) improves the generalization of end-to-end autonomous driving by model aggregation, conventional single-hop FL (SFL) suffers from a slow convergence rate due to long-range communications between vehicles and the cloud server. Hierarchical federated learning (HFL) overcomes such drawbacks via the introduction of mid-point edge servers. However, the orchestration between constrained communication resources and HFL performance becomes an urgent problem. This paper proposes an optimization-based Communication Resource Constrained Hierarchical Federated Learning (CRCHFL) framework to minimize the generalization error of the autonomous driving model using hybrid data and model aggregation. The effectiveness of the proposed CRCHFL is evaluated in the Car Learning to Act (CARLA) simulation platform. Results show that the proposed CRCHFL both accelerates the convergence rate and enhances the generalization of the federated learning autonomous driving model. Moreover, under the same communication resource budget, it outperforms HFL by 10.33% and SFL by 12.44%.
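The two-level aggregation underlying HFL can be sketched in a few lines (illustrative only, not the CRCHFL optimization: models are plain parameter lists, equal weighting is assumed, and the grouping of vehicles under edge servers is hypothetical):

```python
def fedavg(models, weights=None):
    """(Weighted) average of parameter vectors, here plain Python lists."""
    n = len(models)
    w = weights or [1.0 / n] * n
    return [sum(w[i] * m[j] for i, m in enumerate(models))
            for j in range(len(models[0]))]

def hierarchical_round(vehicle_models, edge_groups):
    """One HFL round: each edge server averages its vehicles' models, then the
    cloud averages the edge models. Two short hops replace one long
    vehicle-to-cloud hop, which is what speeds up convergence over SFL."""
    edge_models = [fedavg([vehicle_models[i] for i in group])
                   for group in edge_groups]
    return fedavg(edge_models)
```

With equal group sizes this reproduces plain FedAvg; unequal groups would need weighting by group size, which the sketch omits.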
|
|
09:30-09:36, Paper WeAT16.11 | Add to My Program |
Poly-MOT: A Polyhedral Framework for 3D Multi-Object Tracking |
|
Li, Xiaoyu | Harbin Institute of Technology |
Xie, Tao | Harbin Institute of Technology |
Liu, Dedong | Harbin Institute of Technology |
Gao, Jinghan | Harbin Institute of Technology |
Dai, Kun | HIT |
Jiang, Zhiqiang | Harbin Institute of Technology |
Zhao, Lijun | Harbin Institute of Technology |
Wang, Ke | Harbin Institute of Technology |
Keywords: Intelligent Transportation Systems, Computer Vision for Transportation
Abstract: 3D multi-object tracking (MOT) empowers mobile robots to accomplish well-informed motion planning and navigation tasks by providing motion trajectories of surrounding objects. However, existing 3D MOT methods typically employ a single similarity metric and physical model to perform data association and state estimation for all objects. With large-scale modern datasets and real scenes, there is a variety of object categories that commonly exhibit distinctive geometric properties and motion patterns. Such distinctions cause various object categories to behave differently under the same standard, resulting in erroneous matches between trajectories and detections and jeopardizing the reliability of downstream tasks (navigation, etc.). Towards this end, we propose Poly-MOT, an efficient 3D MOT method based on the tracking-by-detection framework that enables the tracker to choose the most appropriate tracking criteria for each object category. Specifically, Poly-MOT leverages different motion models for various object categories to accurately characterize distinct types of motion. We also introduce the constraint of the rigid structure of objects into a specific motion model to accurately describe the highly nonlinear motion of the object. Additionally, we introduce a two-stage data association strategy to ensure that objects can find the optimal similarity metric from three custom metrics for their categories and to reduce missed matches. On the NuScenes dataset, our proposed method achieves state-of-the-art performance with 75.4% AMOTA. The code is available at https://github.com/lixiaoyu2000/Poly-MOT.
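The category-specific motion model idea can be illustrated with a minimal dispatch table. The models and category names here are hypothetical stand-ins; Poly-MOT's actual models (e.g., with rigid-structure constraints) are far richer, and in a real constant-turn-rate model the turn rate would be part of the state rather than a default argument:

```python
import math

def constant_velocity(state, dt):
    """Straight-line step for near-holonomic, low-speed objects.
    state = (x, y, yaw, speed)."""
    x, y, yaw, v = state
    return (x + v * dt * math.cos(yaw), y + v * dt * math.sin(yaw), yaw, v)

def constant_turn_rate(state, dt, omega=0.5):
    """Turning step for objects with pronounced yawing motion (omega is a
    fixed placeholder turn rate for illustration)."""
    x, y, yaw, v = state
    x2 = x + v * dt * math.cos(yaw + 0.5 * omega * dt)
    y2 = y + v * dt * math.sin(yaw + 0.5 * omega * dt)
    return (x2, y2, yaw + omega * dt, v)

# One model per category: the tracker dispatches on the detected class.
MOTION_MODELS = {
    "pedestrian": constant_velocity,
    "car": constant_turn_rate,
}

def predict(category, state, dt):
    return MOTION_MODELS[category](state, dt)
```

The dispatch table is the whole point: the same prediction call behaves differently per category, which is what avoids forcing pedestrians and cars under one physical model.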
|
|
09:36-09:42, Paper WeAT16.12 | Add to My Program |
SUIT: Learning Significance-Guided Information for 3D Temporal Detection |
|
Zhou, Zheyuan | Fudan University |
Lu, Jiachen | Fudan University |
Zeng, Yihan | Shanghai Jiao Tong University |
Xu, Hang | Noah's Ark Lab |
Zhang, Li | Fudan University |
Keywords: Autonomous Vehicle Navigation, Object Detection, Segmentation and Categorization, Deep Learning for Visual Perception
Abstract: 3D object detection from LiDAR point clouds is of critical importance for autonomous driving and robotics. While sequential point clouds have the potential to enhance 3D perception through temporal information, utilizing these temporal features effectively and efficiently remains a challenging problem. Based on the observation that foreground information is sparsely distributed in LiDAR scenes, we believe that sufficient knowledge can be provided in a sparse format rather than dense maps. To this end, we propose to learn Significance-gUided Information for 3D Temporal detection (SUIT), which simplifies temporal information into sparse features for information fusion across frames. Specifically, we first introduce a significant sampling mechanism that extracts information-rich yet sparse features based on predicted object centroids. On top of that, we present an explicit geometric transformation learning technique, which learns the object-centric transformations among sparse features across frames. We evaluate our method on the large-scale nuScenes and Waymo datasets, where SUIT not only significantly reduces the memory and computation cost of temporal fusion, but also performs well over state-of-the-art baselines.
|
|
09:42-09:48, Paper WeAT16.13 | Add to My Program |
Autonomous Navigation System in Pedestrian Scenarios Using a Dreamer-Based Motion Planner |
|
Zhu, Wei | Tohoku University |
Hayashibe, Mitsuhiro | Tohoku University |
Keywords: Reinforcement Learning, Autonomous Vehicle Navigation, Human-Aware Motion Planning
Abstract: Navigation among pedestrians is a crucial capability of service robots; however, it is a challenge to manage time-varying environments stably. Recent deep reinforcement learning (DRL)-based approaches to crowd navigation have yielded numerous promising applications. However, they rely heavily on initial imitation learning and colossal positive datasets. Moreover, the difficulties in accurately localizing robots, detecting and tracking humans, and representing and generalizing reciprocal human relationships restrict their deployment in real-world problems. We propose a Dreamer-based motion planner for collision-free navigation in diverse pedestrian scenarios. Our RL framework can learn entirely from zero experience via model-based DRL. The robot and humans are first projected onto a map, which is subsequently decoded into a low-dimensional latent state. A predictive dynamics model in the latent space is jointly learned to efficiently optimize the navigation policy. Additionally, we leverage the techniques of system identification, domain randomization, clustering, and LiDAR SLAM for practical deployment. Simulation ablations and real implementations demonstrate that our motion planner outperforms state-of-the-art methods, and that the navigation system can be physically implemented in the real world.
|
|
WeAT17 Regular session, 330B |
Add to My Program |
Model Learning |
|
|
Chair: Abraham, Ian | Yale University |
Co-Chair: Krovi, Venkat | Clemson University |
|
08:30-08:36, Paper WeAT17.1 | Add to My Program |
Learning Stable Models for Prediction and Control (I) |
|
Mamakoukas, Giorgos | Northwestern University |
Abraham, Ian | Yale University |
Murphey, Todd | Northwestern University |
Keywords: Model Learning for Control, Dynamics, Koopman Operators, Learning and Adaptive Systems
Abstract: This paper demonstrates the benefits of imposing stability on data-driven Koopman operators. The data-driven identification of stable Koopman operators (DISKO) is implemented using an algorithm (Mamakoukas et al., 2020) that computes the nearest stable matrix solution to a least-squares reconstruction error. As a first result, we derive a formula that describes the prediction error of Koopman representations for an arbitrary number of time steps, and which shows that stability constraints can improve predictive accuracy over long horizons. As a second result, we determine formal conditions on the basis functions of Koopman operators needed to satisfy the stability properties of an underlying nonlinear system. As a third result, we derive formal conditions for constructing Lyapunov functions for nonlinear systems out of stable data-driven Koopman operators, which we use to verify stabilizing control from data. Lastly, we demonstrate the benefits of DISKO in prediction and control with simulations using a pendulum and a quadrotor, and experiments with a pusher-slider system. The paper is complemented with a video: https://sites.google.com/view/learning-stable-koop
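A one-dimensional caricature of stability-constrained identification (not the DISKO algorithm itself, which solves a nearest-stable-matrix problem in the lifted space): fit the least-squares linear operator, then project it into the stable region, which is what keeps long-horizon rollouts bounded:

```python
def fit_stable_scalar(xs):
    """Least-squares fit of x[t+1] ~= a * x[t] over a scalar trajectory, then
    projection of a into the stable region |a| <= 1. This is a toy 1-D
    analogue of projecting a Koopman matrix onto spectral radius <= 1."""
    num = sum(xs[t] * xs[t + 1] for t in range(len(xs) - 1))
    den = sum(x * x for x in xs[:-1])
    a = num / den
    return max(-1.0, min(1.0, a))  # the "nearest stable" scalar

def rollout(a, x0, steps):
    """Multi-step prediction: with |a| <= 1, |rollout| never exceeds |x0|."""
    x = x0
    for _ in range(steps):
        x = a * x
    return x
```

Without the clipping step, a fit from unstable-looking data (a > 1) would blow up exponentially over long horizons; with it, the prediction error stays bounded, which mirrors the paper's first result in the simplest possible setting.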
|
|
08:36-08:42, Paper WeAT17.2 | Add to My Program |
Enhancing Sample Efficiency and Uncertainty Compensation in Learning-Based Model Predictive Control for Aerial Robots |
|
Chee, Kong Yao | University of Pennsylvania |
Costa Silva, Thales | University of Pennsylvania |
Hsieh, M. Ani | University of Pennsylvania |
Pappas, George J. | University of Pennsylvania |
Keywords: Machine Learning for Robot Control, Model Learning for Control
Abstract: The recent increase in data availability and reliability has led to a surge in the development of learning-based model predictive control (MPC) frameworks for robot systems. Despite attaining substantial performance improvements over their non-learning counterparts, many of these frameworks rely on an offline learning procedure to synthesize a dynamics model. This implies that uncertainties encountered by the robot during deployment are not accounted for in the learning process. On the other hand, learning-based MPC methods that learn dynamics models online are computationally expensive and often require a significant amount of data. To alleviate these shortcomings, we propose a novel learning-enhanced MPC framework that incorporates components from L1 adaptive control into learning-based MPC. This integration enables the accurate compensation of both matched and unmatched uncertainties in a sample-efficient way, enhancing the control performance during deployment. In our proposed framework, we present two variants and apply them to the control of a quadrotor system. Through simulations and physical experiments, we demonstrate that the proposed framework not only allows the synthesis of an accurate dynamics model on-the-fly, but also significantly improves the closed-loop control performance under a wide range of spatio-temporal uncertainties.
|
|
08:42-08:48, Paper WeAT17.3 | Add to My Program |
Data-Driven Modeling and Experimental Validation of Autonomous Vehicles Using Koopman Operator |
|
Joglekar, Ajinkya | Clemson University |
Sutavani, Sarang | Clemson University |
Samak, Chinmay | Clemson University International Center for Automotive Research |
Samak, Tanmay | Clemson University International Center for Automotive Research |
Kosaraju, Krishna Chaitanya | Clemson University |
Smereka, Jonathon M. | U.S. Army TARDEC |
Gorsich, David | The U.S. Army Ground Vehicle Systems Center |
Vaidya, Umesh | Clemson University |
Krovi, Venkat | Clemson University |
Keywords: Model Learning for Control, Autonomous Vehicle Navigation, Optimization and Optimal Control
Abstract: This paper presents a data-driven framework to discover underlying dynamics on a scaled F1TENTH vehicle using the Koopman operator linear predictor. Traditionally, a range of white-, gray-, or black-box models are used to develop controllers for vehicle path tracking. However, these models are either constrained to linearized operational domains, unable to handle significant variability, or lose explainability in end-to-end operational settings. The Koopman Extended Dynamic Mode Decomposition (EDMD) linear predictor seeks to utilize data-driven model learning whilst providing benefits like explainability, model analysis and the ability to utilize linear model-based control techniques. We consider a trajectory-tracking problem for our scaled vehicle platform. We collect pose measurements of our F1TENTH car undergoing standard vehicle dynamics benchmark maneuvers with an OptiTrack indoor localization system. Utilizing these uniformly spaced temporal snapshots of the states and control inputs, a data-driven Koopman EDMD model is identified. This model serves as a linear predictor for state propagation, upon which an MPC feedback law is designed to enable trajectory tracking. The prediction and control capabilities of our framework are highlighted through real-time deployment on our scaled vehicle.
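For readers unfamiliar with EDMD, the core idea of the abstract above can be sketched in a few lines: lift the state through a dictionary of observables, then fit a linear predictor in the lifted space by least squares. This is an illustrative sketch only, not the authors' implementation; the choice of observables below is arbitrary.

```python
import numpy as np

def lift(x):
    """Lift the state into a dictionary of observables.
    The dictionary here (state, squares, sin, cos) is a hypothetical choice."""
    x = np.atleast_2d(x)
    return np.hstack([x, x**2, np.sin(x), np.cos(x)])

def edmd_fit(X, U, X_next):
    """Least-squares fit of a linear predictor z+ = A z + B u in lifted space."""
    Z, Z_next = lift(X), lift(X_next)
    G = np.hstack([Z, U])                      # regressors [z, u]
    W, *_ = np.linalg.lstsq(G, Z_next, rcond=None)
    A = W[:Z.shape[1]].T                       # lifted-state transition matrix
    B = W[Z.shape[1]:].T                       # input matrix
    return A, B

# Toy data from a damped scalar system x+ = 0.9 x + 0.1 u
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
U = rng.uniform(-1, 1, size=(200, 1))
X_next = 0.9 * X + 0.1 * U

A, B = edmd_fit(X, U, X_next)
z_pred = lift(X) @ A.T + U @ B.T               # one-step prediction in lifted space
x_pred = z_pred[:, :1]                          # first coordinate recovers the state
```

Because the lifted model is linear, it can be dropped directly into a linear MPC formulation, which is the benefit the abstract emphasizes.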
|
|
08:48-08:54, Paper WeAT17.4 | Add to My Program |
Adaptive Exploration-Exploitation Active Learning of Gaussian Processes |
|
Kontoudis, George | University of Maryland |
Otte, Michael W. | University of Maryland |
Keywords: Machine Learning for Robot Control, Probability and Statistical Methods, Autonomous Agents
Abstract: Active Learning of Gaussian process (GP) surrogates is an efficient way to model unknown environments in various applications. In this paper, we propose an adaptive exploration-exploitation active learning method (ALX) that can be executed rapidly to facilitate real-time decision making. For the exploration phase, we formulate an acquisition function that maximizes the approximated, expected Fisher information. For the exploitation phase, we employ a closed-form acquisition function that maximizes the total expected variance reduction of the search space. The determination of each phase is established with an exploration condition that measures the predictive accuracy of GP surrogates. Extensive numerical experiments in multiple input spaces validate the efficiency of our method.
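The exploitation phase described above selects samples by expected variance reduction of a GP surrogate. A minimal sketch of the underlying quantity, the GP posterior variance at candidate points, is shown below; this is a generic illustration (RBF kernel, hand-picked hyperparameters), not the ALX acquisition function itself.

```python
import numpy as np

def rbf(A, B, ls=0.5):
    """Squared-exponential kernel between two sets of points."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def gp_posterior_var(X_train, X_cand, noise=1e-4, ls=0.5):
    """Posterior variance of a zero-mean GP at candidate points."""
    K = rbf(X_train, X_train, ls) + noise * np.eye(len(X_train))
    Ks = rbf(X_cand, X_train, ls)
    Kss = rbf(X_cand, X_cand, ls)
    v = np.linalg.solve(K, Ks.T)
    return np.diag(Kss - Ks @ v)

# Greedy variance-based selection: sample where the surrogate is most uncertain
X_train = np.array([[0.0], [1.0]])
X_cand = np.linspace(0, 1, 11).reshape(-1, 1)
var = gp_posterior_var(X_train, X_cand)
x_next = X_cand[np.argmax(var)]
```

With training points at the two ends of the interval, the variance peaks at the midpoint, so the next sample lands where the model knows least.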
|
|
08:54-09:00, Paper WeAT17.5 | Add to My Program |
Sample-Efficient Real-Time Planning with Curiosity Cross-Entropy Method and Contrastive Learning |
|
Kotb, Mostafa | Hamburg University |
Weber, Cornelius | Knowledge Technology Group, University of Hamburg |
Wermter, Stefan | University of Hamburg |
Keywords: Model Learning for Control, Reinforcement Learning, Deep Learning Methods
Abstract: Model-based reinforcement learning (MBRL) with real-time planning has shown great potential in locomotion and manipulation control tasks. However, the existing planning methods, such as the Cross-Entropy Method (CEM), do not scale well to complex high-dimensional environments. One of the key reasons for underperformance is the lack of exploration, as these planning methods only aim to maximize the cumulative extrinsic reward over the planning horizon. Furthermore, planning inside the compact latent space in the absence of observations makes it challenging to use curiosity-based intrinsic motivation. We propose Curiosity CEM (CCEM), an improved version of the CEM algorithm for encouraging exploration via curiosity. Our proposed method maximizes the sum of state-action Q values over the planning horizon, in which these Q values estimate the future extrinsic and intrinsic reward, hence encouraging reaching novel observations. In addition, our model uses contrastive representation learning to efficiently learn latent representations. Experiments on image-based continuous control tasks from the DeepMind Control suite show that CCEM is by a large margin more sample-efficient than previous MBRL algorithms and compares favorably with the best model-free RL methods.
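The vanilla Cross-Entropy Method that CCEM builds on is compact enough to sketch: sample action sequences from a Gaussian, keep the elites, refit, repeat. This is a generic CEM planner on a toy objective, not the authors' CCEM; their method would add learned Q-values with an intrinsic curiosity term inside the scoring function.

```python
import numpy as np

def cem_plan(reward_fn, horizon=5, iters=20, pop=200, elite_frac=0.1, seed=0):
    """Vanilla Cross-Entropy Method over an open-loop action sequence."""
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(horizon), np.ones(horizon)
    n_elite = int(pop * elite_frac)
    for _ in range(iters):
        samples = rng.normal(mu, sigma, size=(pop, horizon))
        scores = np.array([reward_fn(s) for s in samples])
        elite = samples[np.argsort(scores)[-n_elite:]]   # top-scoring sequences
        mu, sigma = elite.mean(0), elite.std(0) + 1e-6   # refit the sampling dist.
    return mu

# Toy objective: reward peaks when every action in the plan equals 0.7
plan = cem_plan(lambda a: -np.sum((a - 0.7) ** 2))
```

The distribution collapses onto the optimum within a few iterations on this smooth objective; the abstract's point is that replacing the extrinsic-only score with Q-values capturing intrinsic reward restores exploration in latent-space planning.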
|
|
09:00-09:06, Paper WeAT17.6 | Add to My Program |
Underactuated MIMO Airship Control Based on Online Data-Driven Reinforcement Learning |
|
Boase, Derek | University of Ottawa |
Gueaieb, Wail | University of Ottawa |
Miah, Suruz | Bradley University |
Keywords: Machine Learning for Robot Control, Aerial Systems: Mechanics and Control, Reinforcement Learning
Abstract: In this work, a novel online model-free controller for an underactuated dirigible is developed based on reinforcement learning and optimal control theory. A reinforcement learning structure is used while overcoming the dependence of the value function on future values by introducing a neural network that is adapted using input-output data. The suboptimal critic neural network is structured such that optimality is guaranteed over the interval from which the data is valid. The system performance is validated using a highly realistic physics engine, Gazebo, with the Robot Operating System (ROS) interface, and the results are compared to the performance of a model-based controller specifically designed to control the airship model. It is emphasized that the proposed formulation does not leverage any knowledge of vehicle dynamics and thus is considered a vehicle-agnostic control strategy.
|
|
09:06-09:12, Paper WeAT17.7 | Add to My Program |
Grasp Stability Assessment through Attention-Guided Cross-Modality Fusion and Transfer Learning |
|
Zhang, Zhuangzhuang | Shanghai Jiao Tong University |
Zhou, Zhenning | Shanghai Jiao Tong University |
Wang, Haili | Shanghai Jiao Tong University |
Zhang, Zhinan | Shanghai Jiao Tong University |
Huang, Huang | Beijing Institute of Control Engineering |
Cao, Qixin | Shanghai Jiao Tong University |
Keywords: Machine Learning for Robot Control
Abstract: Research has been conducted on assessing grasp stability, a crucial prerequisite for achieving optimal grasping strategies, including the minimum force grasping policy. However, existing works employ basic feature-level fusion techniques to combine visual and tactile modalities, resulting in the inadequate utilization of complementary information and the inability to model interactions between unimodal features. This work proposes an attention-guided cross-modality fusion architecture to comprehensively integrate visual and tactile features. This model mainly comprises convolutional neural networks (CNNs), self-attention, and cross-attention mechanisms. In addition, most existing methods collect datasets from real-world systems, which is time-consuming and costly, and the datasets collected are comparatively limited in size. This work establishes a robotic grasping system through physics simulation to collect a multimodal dataset. To address the sim-to-real transfer gap, we propose a migration strategy encompassing domain randomization and domain adaptation techniques. The experimental results demonstrate that the proposed fusion framework achieves markedly enhanced prediction performance (approximately 10%) compared to other baselines. Moreover, our findings suggest that the trained model can be reliably transferred to real robotic systems, indicating its potential to address real-world challenges.
|
|
09:12-09:18, Paper WeAT17.8 | Add to My Program |
On-Robot Bayesian Reinforcement Learning for POMDPs |
|
Hai, Nguyen | Northeastern University |
Katt, Sammie | Northeastern |
Xiao, Yuchen | Northeastern University |
Amato, Christopher | Northeastern University |
Keywords: Machine Learning for Robot Control, Learning from Experience, Reinforcement Learning
Abstract: Robot learning is often difficult due to the expense of gathering data. The need for large amounts of data can, and should, be battled with effective algorithms and through the leverage of expert information on robot dynamics. Bayesian reinforcement learning (BRL), thanks to its sample efficiency and ability to exploit prior knowledge, is uniquely positioned as a solution method. Unfortunately, the application of BRL has been limited due to, we argue, the difficulties of representing expert knowledge as well as solving the subsequent inference problem. This document advances BRL for robotics by proposing a specialized framework for physical systems. In particular, we capture this knowledge in a factored representation, then demonstrate the posterior factorizes in a similar shape, and ultimately formalize this in a Bayesian framework. We then introduce a sample-based online solution method, based on Monte-Carlo tree search and particle filtering, specialized to solve the resulting model. This approach can, for example, utilize typical low-level robot simulators and handle uncertainty over unknown dynamics of the environment. We empirically demonstrate its efficiency by performing on-robot learning in two human-robot interaction tasks with uncertainty about human behavior, achieving near-optimal performance after only a handful of real-world episodes.
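The online solution method above maintains a posterior over unknown dynamics with particle filtering. A minimal sequential importance resampling (SIR) update is sketched below on a toy scalar latent variable; this is a generic illustration of the belief-update machinery, not the authors' factored Bayesian framework.

```python
import numpy as np

def pf_update(particles, weights, obs, obs_lik, rng):
    """Weight particles by observation likelihood, then resample (SIR)."""
    weights = weights * np.array([obs_lik(p, obs) for p in particles])
    weights /= weights.sum()
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

# Toy example: an unknown scalar parameter (true value 1.0) observed noisily
rng = np.random.default_rng(0)
particles = rng.uniform(-2, 2, size=500)          # prior belief
weights = np.full(500, 1.0 / 500)
lik = lambda p, o: np.exp(-0.5 * (o - p) ** 2 / 0.25)  # Gaussian obs. model
for obs in [1.1, 0.9, 1.05, 0.95]:
    particles, weights = pf_update(particles, weights, obs, lik, rng)
```

After a few observations the particle cloud concentrates near the true value; in the paper's setting the "observations" come from real robot episodes and the particles carry hypotheses about unknown dynamics such as human behavior.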
|
|
09:18-09:24, Paper WeAT17.9 | Add to My Program |
Uncertainty-Aware Model-Based Offline Reinforcement Learning for Automated Driving |
|
Diehl, Christopher | TU Dortmund University |
Sievernich, Timo | TU Dortmund |
Krueger, Martin | TU Dortmund University |
Hoffmann, Frank | Technische Universität Dortmund |
Bertram, Torsten | Technische Universität Dortmund |
Keywords: Machine Learning for Robot Control, Human-Aware Motion Planning, Intelligent Transportation Systems
Abstract: Offline reinforcement learning (RL) provides a framework for learning decision-making from offline data and therefore constitutes a promising approach for real-world applications such as automated driving (AD). Especially in safety-critical applications, interpretability and transferability are crucial to success. That motivates model-based offline RL approaches, which leverage planning. However, current state-of-the-art (SOTA) methods often neglect the influence of aleatoric uncertainty arising from the stochastic behavior of multi-agent systems. Further, while many algorithms state that they are suitable for AD, there is still a lack of evaluation in challenging scenarios. This work proposes a novel approach for Uncertainty-aware Model-Based Offline REinforcement Learning Leveraging plAnning (UMBRELLA), which jointly solves the prediction, planning, and control problem of the self-driving vehicle (SDV) in an interpretable learning-based fashion. A trained action-conditioned stochastic dynamics model captures distinctively different future evolutions of the traffic scene. The analysis provides empirical evidence for the effectiveness of our approach and SOTA performance in challenging AD simulations and using a real-world public dataset.
|
|
09:24-09:30, Paper WeAT17.10 | Add to My Program |
Bayesian Multi-Task Learning MPC for Robotic Mobile Manipulation |
|
Arcari, Elena | ETH Zurich |
Minniti, Maria Vittoria | ETH Zurich |
Scampicchio, Anna | ETH Zurich |
Carron, Andrea | ETH Zurich |
Farshidian, Farbod | ETH Zurich |
Hutter, Marco | ETH Zurich |
Zeilinger, Melanie N. | ETH Zurich |
Keywords: Model Learning for Control, Transfer Learning, Mobile Manipulation
Abstract: Mobile manipulation in robotics is challenging due to the need of solving many diverse tasks, such as opening a door or picking-and-placing an object. Typically, a basic first-principles system description of the robot is available, thus motivating the use of model-based controllers. However, the robot dynamics and its interaction with an object are affected by uncertainty, limiting the controller's performance. To tackle this problem, we propose a Bayesian multi-task learning model that uses trigonometric basis functions to identify the error in the dynamics. In this way, data from different but related tasks can be leveraged to provide a descriptive error model that can be efficiently updated online for new, unseen tasks. We combine this learning scheme with a model predictive controller, and extensively test the effectiveness of the proposed approach, including comparisons with available baseline controllers. We present simulation tests with a ball-balancing robot, and door-opening hardware experiments with a quadrupedal manipulator.
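The error-model idea above, Bayesian regression on trigonometric basis functions with efficient online updates, can be sketched with a conjugate Gaussian weight posterior. This is a generic illustration under assumed forms (the exact basis, priors, and multi-task structure in the paper may differ).

```python
import numpy as np

def trig_features(x, n=3):
    """Trigonometric basis up to harmonic n (assumed form for illustration)."""
    x = np.atleast_1d(x)
    feats = [np.ones_like(x)]
    for k in range(1, n + 1):
        feats += [np.sin(k * x), np.cos(k * x)]
    return np.stack(feats, axis=-1)

def bayes_update(Phi, y, prior_prec=1.0, noise_var=0.01, mean0=None, prec0=None):
    """Conjugate Gaussian update of the weight posterior. Passing the previous
    (mean, precision) back in allows efficient online updates, task by task."""
    d = Phi.shape[1]
    if prec0 is None:
        prec0 = prior_prec * np.eye(d)
        mean0 = np.zeros(d)
    prec = prec0 + Phi.T @ Phi / noise_var
    mean = np.linalg.solve(prec, prec0 @ mean0 + Phi.T @ y / noise_var)
    return mean, prec

# Toy model error to identify: a 0.5*sin(x) residual on top of nominal dynamics
x = np.linspace(-np.pi, np.pi, 100)
y = 0.5 * np.sin(x)
mean, prec = bayes_update(trig_features(x), y)
y_hat = trig_features(x) @ mean
```

Because the posterior precision accumulates additively, data from related tasks can warm-start the prior for a new, unseen task, which is the multi-task leverage the abstract describes.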
|
|
09:30-09:36, Paper WeAT17.11 | Add to My Program |
Off-Policy Evaluation with Online Adaptation for Robot Exploration in Challenging Environments |
|
Hu, Yafei | Carnegie Mellon University |
Geng, Junyi | Pennsylvania State University |
Wang, Chen | State University of New York at Buffalo |
Keller, John | Carnegie Mellon University |
Scherer, Sebastian | Carnegie Mellon University |
Keywords: Machine Learning for Robot Control, Reinforcement Learning, Deep Learning Methods
Abstract: Autonomous exploration has many important applications. However, classic information gain-based or frontier-based exploration relies only on the robot's current state to determine the immediate exploration goal, which lacks the capability of predicting the value of future states and thus leads to inefficient exploration decisions. This paper presents a method to learn how “good” states are, measured by the state value function, to provide guidance for robot exploration in real-world challenging environments. We formulate our work as an off-policy evaluation (OPE) problem for robot exploration (OPERE). It consists of offline Monte-Carlo training on real-world data and performs Temporal Difference (TD) online adaptation to optimize the trained value estimator. We also design an intrinsic reward function based on sensor information coverage to enable the robot to gain more information with sparse extrinsic rewards. Results show that our method enables the robot to predict the value of future states so as to better guide robot exploration. The proposed algorithm achieves better prediction and exploration performance compared with the state of the art. To the best of our knowledge, this work for the first time demonstrates value function prediction on a real-world dataset for robot exploration in challenging subterranean and urban environments. More details and demo videos can be found at https://jeffreyyh.github.io/opere/.
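The TD online-adaptation step mentioned above has a standard textbook form: nudge a value estimator toward the bootstrapped target. A minimal TD(0) sketch on a linear value function is shown below; it illustrates the mechanism only, not the OPERE estimator or reward design.

```python
import numpy as np

def td_update(w, phi_s, r, phi_s_next, gamma=0.99, lr=0.1):
    """One TD(0) step on a linear value estimator V(s) = w . phi(s)."""
    td_err = r + gamma * (w @ phi_s_next) - (w @ phi_s)
    return w + lr * td_err * phi_s, td_err

# Toy two-state chain: s0 -> s1 -> terminal, with rewards 1 then 0
phi = {0: np.array([1.0, 0.0]), 1: np.array([0.0, 1.0]), 'end': np.zeros(2)}
w = np.zeros(2)
for _ in range(200):
    w, _ = td_update(w, phi[0], 1.0, phi[1])      # transition s0 -> s1
    w, _ = td_update(w, phi[1], 0.0, phi['end'])  # transition s1 -> terminal
```

Here the estimator converges to V(s1)=0 and V(s0)=1; in the paper the same bootstrapping correction is applied online to a value network pretrained offline with Monte-Carlo returns.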
|
|
09:36-09:42, Paper WeAT17.12 | Add to My Program |
Data-Driven Steering of Concentric Tube Robots in Unknown Environments Via Dynamic Mode Decomposition |
|
Thamo, Balint | University of Edinburgh |
Khadem, Mohsen | University of Edinburgh |
Hanley, David | University of Illinois at Urbana-Champaign |
Dhaliwal, Kev | University of Edinburgh, Center for Inflammation Research, |
Keywords: Model Learning for Control, Learning from Experience, Surgical Robotics: Steerable Catheters/Needles
Abstract: Concentric Tube Robots (CTRs) are a type of continuum robot capable of manipulating objects in restricted spaces and following smooth curvilinear trajectories. CTRs are ideal instruments for minimally invasive surgeries. Accurate control of the CTR's motion in the presence of contact with tissue and external forces will allow safe deployment of the robot in a variety of minimally invasive surgeries. Here, we propose a data-driven controller that can reliably and precisely direct the robot along predetermined deployment trajectories. In contrast to model-based control approaches, the proposed controller doesn’t rely on a mathematical model of the robot and employs Extended Dynamic Mode Decomposition (EDMD) to learn the nonlinear dynamics of the robot and the interaction forces on the fly. This enables the robot to follow desired trajectories in the presence of unknown perturbations such as external point/distributed forces. A series of experiments are carried out to evaluate the accuracy of the controller in steering the robot on arbitrary trajectories. Results demonstrate that the robot can track arbitrary trajectories with a mean accuracy of less than 2.4 mm in repeated trials. Furthermore, we simulate scenarios where the robot is in contact with a rigid obstacle and is cutting through soft tissue. Based on the results, the proposed controller was capable of reaching various targets with an accuracy of at least 2 mm in the presence of unknown obstacles.
|
|
09:42-09:48, Paper WeAT17.13 | Add to My Program |
ORBIT: A Unified Simulation Framework for Interactive Robot Learning Environments |
|
Mittal, Mayank | ETH Zurich |
Yu, Calvin | University of Toronto |
Yu, Qinxi | University of Toronto |
Liu, Jingzhou | University of Toronto, NVIDIA |
Rudin, Nikita | ETH Zurich, NVIDIA |
Hoeller, David | ETH Zurich, NVIDIA |
Yuan, Jia Lin | University of Toronto |
Singh, Ritvik | NVIDIA |
Guo, Yunrong | NVIDIA |
Mazhar, Hammad | NVIDIA |
Mandlekar, Ajay Uday | NVIDIA |
Babich, Buck | NVIDIA |
State, Gavriel | NVIDIA |
Hutter, Marco | ETH Zurich |
Garg, Animesh | University of Toronto |
Keywords: Software Tools for Benchmarking and Reproducibility, Machine Learning for Robot Control, Simulation and Animation
Abstract: We present ORBIT, a unified and modular framework for robot learning powered by NVIDIA Isaac Sim. It offers a modular design to easily and efficiently create robotic environments with photo-realistic scenes and fast and accurate rigid and deformable body simulation. With ORBIT, we provide a suite of benchmark tasks of varying difficulty, from single-stage cabinet opening and cloth folding to multi-stage tasks such as room reorganization. To support working with diverse observations and action spaces, we include fixed-arm and mobile manipulators with different physically-based sensors and motion generators. ORBIT allows training reinforcement learning policies and collecting large demonstration datasets from hand-crafted or expert solutions in a matter of minutes by leveraging GPU-based parallelization. In summary, we offer an open-source framework that readily comes with 16 robotic platforms, 4 sensor modalities, 10 motion generators, more than 20 benchmark tasks, and wrappers to 4 learning libraries. With this framework, we aim to support various research areas, including representation learning, reinforcement learning, imitation learning, and task and motion planning. We hope it helps establish interdisciplinary collaborations in these communities, and its modularity makes it easily extensible for more tasks and applications in the future. For videos, documentation, and code: https://isaac-orbit.github.io/.
|
|
WeAT19 Regular session, 360 Ambassador Ballroom |
Add to My Program |
Deep Learning for Perception III |
|
|
Chair: Skinner, Katherine | University of Michigan |
Co-Chair: Talak, Rajat | MIT |
|
08:30-08:36, Paper WeAT19.1 | Add to My Program |
TemporalStereo: Efficient Spatial-Temporal Stereo Matching Network |
|
Zhang, Youmin | University of Bologna |
Poggi, Matteo | University of Bologna |
Mattoccia, Stefano | University of Bologna |
Keywords: Computer Vision for Automation, Deep Learning Methods, Deep Learning for Visual Perception
Abstract: We present TemporalStereo, a coarse-to-fine stereo matching network that is highly efficient, and able to effectively exploit the past geometry and context information to boost matching accuracy. Our network leverages sparse cost volume and proves to be effective when a single stereo pair is given. However, its peculiar ability to use spatio-temporal information across stereo sequences allows TemporalStereo to alleviate problems such as occlusions and reflective regions while enjoying high efficiency also in this latter case. Notably, our model -- trained once with stereo videos -- can run in both single-pair and temporal modes seamlessly. Experiments show that our network relying on camera motion is robust even to dynamic objects when running on videos. We validate TemporalStereo through extensive experiments on synthetic (SceneFlow, TartanAir) and real (KITTI 2012, KITTI 2015) datasets. Detailed results show that our model achieves state-of-the-art performance on all of these datasets.
|
|
08:36-08:42, Paper WeAT19.2 | Add to My Program |
Convolutional Occupancy Models for Dense Packing of Complex, Novel Objects |
|
Mishra, Nikhil | UC Berkeley, Covariant.ai |
Abbeel, Pieter | UC Berkeley |
Chen, Xi | Embodied Intelligence, UC Berkeley |
Sieb, Maximilian | CovariantAI |
Keywords: Computer Vision for Automation, Deep Learning for Visual Perception
Abstract: Dense packing in pick-and-place systems is an important feature in many warehouse and logistics applications. Prior work in this space has largely focused on planning algorithms in simulation, but real-world packing performance is often bottlenecked by the difficulty of perceiving 3D object geometry in highly occluded, partially observed scenes. In this work, we present a fully-convolutional shape completion model, F-CON, which can be easily combined with off-the-shelf planning methods for dense packing in the real world. We also release a simulated dataset, COB-3D-v2, that can be used to train shape completion models for real-world robotics applications, and use it to demonstrate that F-CON outperforms other state-of-the-art shape completion methods. Finally, we equip a real-world pick-and-place system with F-CON, and demonstrate dense packing of complex, unseen objects in cluttered scenes. Across multiple planning methods, F-CON enables substantially better dense packing than other shape completion methods.
|
|
08:42-08:48, Paper WeAT19.3 | Add to My Program |
Real-Time Video Inpainting for RGB-D Pipeline Reconstruction |
|
Wang, Luyuan | Carnegie Mellon University |
Tian, Yu | Carnegie Mellon University |
Yan, Xinzhi | Carnegie Mellon University |
Ruan, Fujun | Carnegie Mellon University |
Ganapathy Subramanian, Jaya Aadityaa | Carnegie Mellon University |
Choset, Howie | Carnegie Mellon University |
Li, Lu | Carnegie Mellon University |
Keywords: RGB-D Perception, Deep Learning for Visual Perception, Robotics in Hazardous Fields
Abstract: This paper presents a Video Inpainting algorithm that enables monocular-camera-laser-based pipeline inspection robots to capture both color and 3D information using only one video stream. Conventional monocular-camera-laser inspection methods are limited to generating either 2D color images or 3D point clouds because the laser blocks the actual color of the scanning area. We propose a real-time Video Inpainting method to solve this problem with minimal hardware needs that can be easily integrated with conventional pipeline profiling robots. The algorithm is accelerated by two components: (1) a lightweight network that directly predicts the complete optical flow and simplifies the algorithm pipeline, and (2) the Polar coordinate transformation, which significantly reduces the image processing region. Real-world experiments demonstrate that our online algorithm achieves color estimation accuracy comparable to or better than state-of-the-art offline algorithms, while running at 23 frames per second (FPS) on a laptop computer with a resolution of 1024x1024. In addition, we verify that this method can be used for video pre-processing for downstream tasks that require high-quality visual inputs, such as Simultaneous Localization and Mapping (SLAM). To the best of our knowledge, this is the first real-time Video Inpainting algorithm that can be used for in-pipe environments, serving as an important building block for highly compact RGB-D inspection robots for the pipeline industry.
|
|
08:48-08:54, Paper WeAT19.4 | Add to My Program |
MUFeat: Multi-Level CNN and Unsupervised Learning for Local Feature Detection and Description |
|
Kuo, Sheng-Hung | National Yang Ming Chiao Tung University |
Wu, Tzu-Han | National Yang Ming Chiao Tung University |
Chen, Zheng Yan | National Yang Ming Chiao Tung University, Department Of. Compute |
Chen, Kuan-Wen | National Yang Ming Chiao Tung University |
Keywords: Visual Learning, Deep Learning for Visual Perception, Deep Learning Methods
Abstract: Local feature detection and description are two essential steps in many visual applications. Most learned local feature methods require high-quality labeled data to achieve superior performance, but such labels are often expensive. To address this problem, we propose MUFeat, an unsupervised learning framework that jointly learns a local feature detector and descriptor without the requirement of ground-truth correspondences. MUFeat trains the network based on the putative matches from the pretrained model and two proposed unsupervised loss functions. Furthermore, the MUFeat framework includes a pyramidal feature hierarchy network to obtain keypoints and descriptors from feature maps. Experiments indicate that MUFeat outperforms most state-of-the-art supervised learning methods on image matching, medical image registration and visual localization tasks.
|
|
08:54-09:00, Paper WeAT19.5 | Add to My Program |
Early or Late Fusion Matters: Efficient RGB-D Fusion in Vision Transformers for 3D Object Recognition |
|
Tziafas, Georgios | University of Groningen |
Kasaei, Hamidreza | University of Groningen |
Keywords: RGB-D Perception, Recognition, Transfer Learning
Abstract: The Vision Transformer (ViT) architecture has established its place in computer vision literature, however, training ViTs for RGB-D object recognition remains an under-studied topic, viewed in recent literature only through the lens of multi-task pretraining in multiple vision modalities. Such approaches are often computationally intensive, relying on the scale of multiple pretraining datasets to align RGB with 3D information. In this work, we propose a simple yet strong recipe for transferring pretrained ViTs in RGB-D domains for 3D object recognition, focusing on fusing RGB and depth representations encoded jointly by the ViT. Compared to previous works in multimodal Transformers, the key challenge here is to use the attested flexibility of ViTs to capture cross-modal interactions at the downstream and not the pretraining stage. We explore which depth representation is better in terms of resulting accuracy and compare early and late fusion techniques for aligning the RGB and depth modalities within the ViT architecture. Experimental results in the Washington RGB-D Objects dataset (ROD) demonstrate that in such RGB → RGB-D scenarios, late fusion techniques work better than the more popularly employed early fusion. With our transfer baseline, fusion ViTs score up to 95.4% top-1 accuracy in ROD, achieving new state-of-the-art results in this benchmark. We further show the benefits of using our multimodal fusion baseline over unimodal feature extractors in a synthetic-to-real visual adaptation as well as in an open-ended lifelong learning scenario in the ROD benchmark, where our model outperforms previous works by a margin of >8%. Finally, we integrate our method with a robot framework and demonstrate how it can serve as a perception utility in an interactive robot learning scenario, both in simulation and with a real robot.
|
|
09:00-09:06, Paper WeAT19.6 | Add to My Program |
TransTouch: Learning Transparent Objects Depth Sensing through Sparse Touches |
|
Bian, Liuyu | Tsinghua University |
Shi, Pengyang | Tsinghua University |
Chen, Weihang | Tsinghua University |
Xu, Jing | Tsinghua University |
Yi, Li | Tsinghua University |
Chen, Rui | Tsinghua University |
Keywords: RGB-D Perception, Deep Learning for Visual Perception, Force and Tactile Sensing
Abstract: Transparent objects are common in daily life. However, depth sensing for transparent objects remains a challenging problem. While learning-based methods can leverage shape priors to improve the sensing quality, the labor-intensive data collection in real world and the sim-to-real domain gap restrict these methods' scalability. In this paper, we propose a method to finetune a stereo network with sparse depth labels automatically collected using a probing system with tactile feedback. We present a novel utility function to evaluate the benefit of touches. By approximating and optimizing the utility function, we can optimize the probing locations given a fixed touching budget to better improve the network's performance on real objects. We further combine tactile depth supervision with a confidence-based regularization to prevent over-fitting during finetuning. To evaluate the effectiveness of our method, we construct a real-world dataset including both diffuse and transparent objects. Experimental results on this dataset show that our method can significantly improve real-world depth sensing accuracy, especially for transparent objects.
|
|
09:06-09:12, Paper WeAT19.7 | Add to My Program |
WatchPed: Pedestrian Crossing Intention Prediction Using Embedded Sensors of Smartwatch |
|
Abbasi, Jibran Ali | Afiniti |
Imran, Navid Mohammad | University of Memphis |
Das, Lokesh Chandra | The University of Memphis |
Won, Myounggyu | University of Memphis |
Keywords: Computer Vision for Transportation, Deep Learning for Visual Perception, Intelligent Transportation Systems
Abstract: The pedestrian crossing intention prediction problem is to estimate whether or not the target pedestrian will cross the street. State-of-the-art techniques heavily depend on visual data acquired through the front camera of the ego-vehicle to make a prediction of the pedestrian's crossing intention. Hence, the efficiency of current methodologies tends to decrease notably in situations where visual input is imprecise, for instance, when the distance between the pedestrian and ego-vehicle is considerable or the illumination levels are inadequate. To address the limitation, in this paper, we present the design, implementation, and evaluation of the first-of-its-kind pedestrian crossing intention prediction model based on integration of motion sensor data gathered through the smartwatch (or smartphone) of the pedestrian. We propose an innovative machine learning framework that effectively integrates motion sensor data with visual input to enhance the predictive accuracy significantly, particularly in scenarios where visual data may be unreliable. Moreover, we perform an extensive data collection process and introduce the first pedestrian intention prediction dataset that features synchronized motion sensor data. The dataset comprises 255 video clips that encompass diverse distances and lighting conditions. We trained our model using the widely-used JAAD and our own datasets and compare the performance with a state-of-the-art model. The results demonstrate that our model outperforms the current state-of-the-art method, particularly in cases where the distance between the pedestrian and the observer is considerable (more than 70 meters) and the lighting conditions are inadequate.
|
|
09:12-09:18, Paper WeAT19.8 | Add to My Program |
Object Detection Based on Raw Bayer Images |
|
Lu, Guoyu | University of Georgia |
Keywords: RGB-D Perception, Range Sensing, Deep Learning for Visual Perception
Abstract: Bayer pattern is a widely used Color Filter Array (CFA) for digital image sensors, efficiently capturing different light wavelengths on different pixels without the need for a costly ISP pipeline. The resulting single-channel raw Bayer images offer benefits such as spectral wavelength sensitivity and low time latency. However, object detection based on Bayer images has been underexplored due to challenges in human observation and algorithm design caused by the discontinuous color channels in adjacent pixels. To address this issue, we propose the BayerDetect network, an end-to-end deep object detection framework that aims to achieve fast, accurate, and memory-efficient object detection. Unlike RGB color images, where each pixel encodes spectral context from adjacent pixels during ISP color interpolation, raw Bayer images lack spectral context. To enhance the spectral context, the BayerDetect network introduces a spectral frequency attention block, transforming the raw Bayer image pattern to the frequency domain. In object detection, clear object boundaries are essential for accurate bounding box predictions. To handle the challenges posed by alternating spectral channels and mitigate the influence of discontinuous boundaries, the BayerDetect network incorporates a spatial attention scheme that utilizes deformable convolutional kernels in multiple scales to explore spatial context effectively. The extracted convolutional features are then passed through a sparse set of proposal boxes for detection and classification. We conducted experiments on both public and self-collected raw Bayer images, and the results demonstrate the superb performance of the BayerDetect network in object detection tasks.
|
|
09:18-09:24, Paper WeAT19.9 | Add to My Program |
SDFMAP: Neural Signed Distance Fields for Mapping and Positioning in Real-Time |
|
Liu, Shanfan | Zhejiang University |
Zhu, Jianke | Zhejiang University |
Keywords: RGB-D Perception, Visual Learning, Deep Learning Methods
Abstract: Neural surface reconstruction has recently gained considerable attention due to its promising results on scene rendering. Nevertheless, most existing approaches either treat the camera parameters as a prior during training or indirectly estimate them through structure-from-motion. To tap the potential of implicit neural networks, we present a novel end-to-end neural network, termed SDFMAP, that requires no prior knowledge of the scene, such as pre-computed camera parameters or pre-trained geometric priors. Specifically, our method adopts a single multilayer perceptron to simultaneously achieve pose estimation and indoor scene reconstruction in real-time by learning the truncated signed distance function. Compared to recent neural implicit vSLAM systems, our approach achieves higher tracking speed via a lightweight network. Experiments on several challenging benchmark datasets show that our SDFMAP method achieves state-of-the-art results on camera tracking and scene reconstruction.
|
|
09:24-09:30, Paper WeAT19.10 | Add to My Program |
LocalViT: Analyzing Locality in Vision Transformers |
|
Li, Yawei | ETH Zurich |
Zhang, Kai | ETH Zurich |
Cao, Jiezhang | ETH Zurich |
Timofte, Radu | University of Würzburg |
Magno, Michele | ETH Zurich |
Benini, Luca | University of Bologna |
Van Gool, Luc | ETH Zurich |
Keywords: Recognition, Deep Learning for Visual Perception, Computer Vision for Automation
Abstract: The aim of this paper is to study the influence of locality mechanisms in vision transformers. Transformers originated from machine translation and are particularly good at modelling long-range dependencies within a long sequence. Although the global interaction between the token embeddings can be well modelled by the self-attention mechanism of transformers, what is lacking is a locality mechanism for information exchange within a local region. In this paper, the locality mechanism is systematically investigated by carefully designed controlled experiments. We add locality to the feed-forward network of vision transformers. This seemingly simple solution is inspired by the comparison between feed-forward networks and inverted residual blocks. The importance of locality mechanisms is validated in two ways: 1) a wide range of design choices (activation function, layer placement, expansion ratio) are available for incorporating locality mechanisms, and proper choices can lead to a performance gain over the baseline; and 2) the same locality mechanism is successfully applied to vision transformers with different architecture designs, which shows the generalization of the locality concept. For ImageNet2012 classification, the locality-enhanced transformers outperform the baselines Swin-T, DeiT-T and PVT-T by 1.0%, 2.6% and 3.1% with a negligible increase in the number of parameters and computational effort. Code is available at https://github.com/ofsoundof/LocalViT.
|
|
09:30-09:36, Paper WeAT19.11 | Add to My Program |
CLONeR: Camera-Lidar Fusion for Occupancy Grid-Aided Neural Representations |
|
Carlson, Alexandra | University of Michigan |
Srinivasan Ramanagopal, Manikandasriram | Carnegie Mellon University |
Tseng, Nathan | University of Michigan |
Johnson-Roberson, Matthew | Carnegie Mellon University |
Vasudevan, Ram | University of Michigan |
Skinner, Katherine | University of Michigan |
Keywords: Computer Vision for Transportation, Deep Learning for Visual Perception, Sensor Fusion
Abstract: Recent advances in neural radiance fields (NeRFs) achieve state-of-the-art novel view synthesis and facilitate dense estimation of scene properties. However, NeRFs often fail for outdoor, unbounded scenes that are captured under very sparse views with the scene content concentrated far away from the camera, as is typical for field robotics applications. In particular, NeRF-style algorithms perform poorly: (1) when there are insufficient views with little pose diversity, (2) when scenes contain saturation and shadows, and (3) when finely sampling large unbounded scenes with fine structures becomes computationally intensive. This paper proposes CLONeR, which significantly improves upon NeRF by allowing it to model large unbounded outdoor driving scenes that are observed from sparse input sensor views. This is achieved by decoupling occupancy and color learning within the NeRF framework into separate Multi-Layer Perceptrons (MLPs) trained using LiDAR and camera data, respectively. In addition, this paper proposes a novel method to build differentiable 3D Occupancy Grid Maps (OGM) alongside the NeRF model, and leverage this occupancy grid for improved sampling of points along a ray for volumetric rendering in metric space. Through extensive quantitative and qualitative experiments on scenes from the KITTI dataset, this paper demonstrates that the proposed method outperforms state-of-the-art NeRF models on both novel view synthesis and dense depth prediction tasks when trained on sparse input data.
|
|
09:36-09:42, Paper WeAT19.12 | Add to My Program |
Certifiable Object Pose Estimation: Foundations, Learning Models, and Self-Training (I) |
|
Talak, Rajat | MIT |
Peng, Lisa | Massachusetts Institute of Technology |
Carlone, Luca | Massachusetts Institute of Technology |
Keywords: Object Pose Estimation, Deep Learning in Robotics and Automation, Robot Safety, Visual Learning
Abstract: We consider a certifiable object pose estimation problem, where --given a partial point cloud of an object-- the goal is to not only estimate the object pose, but also to provide a certificate of correctness for the resulting estimate. Our first contribution is a general theory of certification for end-to-end perception models. We introduce the notion of zeta-correctness, and show that it can be assessed by implementing two certificates, namely, a certificate of observable correctness, and a certificate of non-degeneracy. Our second contribution is to apply this theory and design a new learning-based certifiable pose estimator: C-3PO. C-3PO is composed of (i) a semantic-keypoint-based pose estimation model, (ii) a keypoint corrector, a differentiable optimization layer that can correct large detection errors (e.g. due to the sim-to-real gap), and (iii) the two certificates to assess correctness of the estimated pose. Our third contribution is a novel self-supervised training approach that uses the certificate of observable correctness to provide the supervisory signal to C-3PO during training. Extensive experimental results show that standard semantic-keypoint-based methods (which constitute the backbone of C-3PO) outperform more recent alternatives in challenging problem instances; that C-3PO further improves performance and significantly outperforms all the baselines; and that C-3PO's certificates are able to discern correct pose estimates. We release the implementation and an interactive visualization of all the results presented in this paper at: https://github.com/MIT-SPARK/C-3PO and https://github.com/MIT-SPARK/pose-baselines.
|
|
09:42-09:48, Paper WeAT19.13 | Add to My Program |
Perspective Aware Road Obstacle Detection |
|
Lis, Krzysztof | EPFL |
Honari, Sina | EPFL |
Fua, Pascal | EPFL |
Salzmann, Mathieu | EPFL CVLab |
Keywords: Computer Vision for Transportation, Object Detection, Segmentation and Categorization, Deep Learning for Visual Perception
Abstract: While road obstacle detection techniques have become increasingly effective, they typically ignore the fact that, in practice, the apparent size of the obstacles decreases as their distance to the vehicle increases. In this paper, we account for this by computing a scale map encoding the apparent size of a hypothetical object at every image location. We then leverage this perspective map to (i) generate training data by injecting onto the road synthetic objects whose size corresponds to the perspective foreshortening; and (ii) incorporate perspective information in the decoding part of the detection network to guide the obstacle detector. Our results on standard benchmarks show that, together, these two strategies significantly boost the obstacle detection performance, allowing our approach to consistently outperform state-of-the-art methods in terms of instance-level obstacle detection.
|
|
WeAIP Interactive session, Hall E |
|
Poster W1 |
|
|
|
Subsession WeAIP-01, Hall E | |
Clone of 'HRI - Learning' Regular session, 13 papers |
|
Subsession WeAIP-02, Hall E | |
Clone of 'Human and Robot Teaming' Regular session, 14 papers |
|
Subsession WeAIP-03, Hall E | |
Clone of 'Field Robots I' Regular session, 13 papers |
|
Subsession WeAIP-04, Hall E | |
Clone of 'Optimization and Optimal Control I' Regular session, 12 papers |
|
Subsession WeAIP-05, Hall E | |
Clone of 'Manufacturing and Logistics I' Regular session, 13 papers |
|
Subsession WeAIP-06, Hall E | |
Clone of 'Soft Robot Design and Modelling' Regular session, 13 papers |
|
Subsession WeAIP-07, Hall E | |
Clone of 'Surgical Robotics - Control' Regular session, 13 papers |
|
Subsession WeAIP-08, Hall E | |
Clone of 'Humanoid and Bipedal Locomotion' Regular session, 13 papers |
|
Subsession WeAIP-09, Hall E | |
Clone of 'Formal Methods and Planning' Regular session, 13 papers |
|
Subsession WeAIP-10, Hall E | |
Clone of 'Dexterous Manipulation' Regular session, 14 papers |
|
Subsession WeAIP-11, Hall E | |
Clone of 'Swarms' Regular session, 13 papers |
|
Subsession WeAIP-12, Hall E | |
Clone of 'Tactile Sensing I' Regular session, 13 papers |
|
Subsession WeAIP-13, Hall E | |
Clone of 'Vision-Based Navigation III' Regular session, 13 papers |
|
Subsession WeAIP-14, Hall E | |
Clone of 'SLAM I' Regular session, 14 papers |
|
Subsession WeAIP-15, Hall E | |
Clone of 'Multi-Robot and Distributed Robot Systems I' Regular session, 12 papers |
|
Subsession WeAIP-16, Hall E | |
Clone of 'Learning for Navigation' Regular session, 13 papers |
|
Subsession WeAIP-17, Hall E | |
Clone of 'Model Learning' Regular session, 13 papers |
|
Subsession WeAIP-18, Hall E | |
Clone of 'Deep Learning for Perception III' Regular session, 13 papers |
|
10:00-11:30, Subsession WeAIP-20, Hall E | |
Late Breaking Posters V Late breaking, 33 papers |
|
WeAIP-01 Regular session, Hall E |
Add to My Program |
Clone of 'HRI - Learning' |
|
|
|
10:00-11:30, Paper WeAIP-01.1 | Add to My Program |
Primitive Skill-Based Robot Learning from Human Evaluative Feedback |
|
Hiranaka, Ayano | Stanford University |
Hwang, Minjune | Stanford University |
Lee, Sharon | Stanford University |
Wang, Chen | Stanford University |
Fei-Fei, Li | Stanford University |
Wu, Jiajun | Stanford University |
Zhang, Ruohan | Stanford University |
Keywords: Human Factors and Human-in-the-Loop, Reinforcement Learning, Machine Learning for Robot Control
Abstract: Reinforcement learning (RL) algorithms face significant challenges when dealing with long-horizon robot manipulation tasks in real-world environments due to sample inefficiency and safety issues. To overcome these challenges, we propose a novel framework, SEED, which leverages two approaches: reinforcement learning from human feedback (RLHF) and primitive skill-based reinforcement learning. Both approaches are particularly effective in addressing sparse reward issues and the complexities involved in long-horizon tasks. By combining them, SEED reduces the human effort required in RLHF and increases safety in training robot manipulation with RL in real-world settings. Additionally, parameterized skills provide a clear view of the agent's high-level intentions, allowing humans to evaluate skill choices before they are executed. This feature makes the training process even safer and more efficient. To evaluate the performance of SEED, we conducted extensive experiments on five manipulation tasks with varying levels of complexity. Our results show that SEED significantly outperforms state-of-the-art RL algorithms in sample efficiency and safety. In addition, SEED also exhibits a substantial reduction of human effort compared to other RLHF methods. Further details and video results can be found at https://seediros23.github.io/.
|
|
10:00-11:30, Paper WeAIP-01.2 | Add to My Program |
Autocomplete of 3D Motions for UAV Teleoperation |
|
Ibrahim, Batool | American University of Beirut AUB |
Haj Hussein, Mohammad | American University of Beirut |
Elhajj, Imad | American University of Beirut |
Asmar, Daniel | American University of Beirut |
Keywords: Human-Robot Collaboration, Deep Learning Methods
Abstract: Tele-operating aerial vehicles without any automated assistance is challenging due to various limitations, especially for inexperienced users. Autocomplete addresses this problem by automatically identifying and completing the user's intended motion. Such a framework uses machine learning to recognize and classify human inputs as one of a set of motion primitives and then, if the human operator accepts, synthesizes the motion in order to complete the desired motion. This has been shown to improve the performance of the system and reduce operator workload. Previous Autocomplete systems focused on different 2D motions (line, arc, sine, etc.). However, since most UAV tasks take place in a 3D world, this paper introduces 3D Autocomplete for 3D motions. Moreover, the proposed framework provides just-in-time prediction of the 3D motions by proposing a change-point detection technique, which allows the framework to autonomously identify when to conduct a prediction. It also deals with variable motion sizes. Real-time simulation results show that the proposed framework is capable of predicting the user's intentions after change-point detection.
|
|
10:00-11:30, Paper WeAIP-01.3 | Add to My Program |
Exploiting Spatio-Temporal Human-Object Relations Using Graph Neural Networks for Human Action Recognition and 3D Motion Forecasting |
|
Lagamtzis, Dimitrios | Esslingen University of Applied Sciences |
Schmidt, Fabian | Esslingen University of Applied Sciences |
Seyler, Jan Reinke | Festo SE & Co. KG |
Dang, Thao | Daimler AG |
Schober, Steffen | Esslingen University |
Keywords: Human-Robot Collaboration, Intention Recognition, Industrial Robots
Abstract: Human action recognition and motion forecasting are becoming increasingly successful, in particular when utilizing graphs. We aim to transfer this success into the context of industrial Human-Robot Collaboration (HRC), where humans work closely with robots and interact with workpieces in defined workspaces. For this purpose, it is necessary to use all the available information extractable in such a workspace and represent it with a natural structure, such as graphs, that can be used for learning. Since humans are the center of HRC, it is mandatory to construct the graph in a human-centered way and use real-world 3D information as well as object labels to represent their environment. Therefore, we present a novel Graph Neural Network (GNN) architecture which combines human action recognition and motion forecasting for industrial HRC environments. We evaluate our method with two different and publicly available human action datasets, including one that is a particularly realistic representation of industrial HRC, and compare the results with baseline methods for classifying the current human action and predicting the human motion. Our experiments show that our combined GNN approach improves the accuracy of action recognition compared to previous work, significantly so on the CoAx dataset by up to 20%. Further, our motion forecasting approach performs better than existing baselines, predicting human trajectories with a Final Displacement Error (FDE) of less than 10 cm for a prediction horizon of 1 s.
|
|
10:00-11:30, Paper WeAIP-01.4 | Add to My Program |
Improving Human-Robot Interaction Effectiveness in Human-Robot Collaborative Object Transportation Using Force Prediction |
|
Dominguez-Vidal, Jose Enrique | Institut de Robòtica i Informàtica Industrial, CSIC-UPC |
Sanfeliu, Alberto | Universitat Politècnica de Catalunya |
Keywords: Human-Robot Collaboration, Physical Human-Robot Interaction, Deep Learning Methods
Abstract: In this work, we analyse the use of a prediction of the human's force in a Human-Robot collaborative object transportation task at a middle distance. We check that this force prediction can improve multiple parameters associated with effective Human-Robot Interaction (HRI), such as perception of the robot's contribution to the task, comfort, or trust in the robot in a physical Human-Robot Interaction (pHRI) setting. We present a Deep Learning model that predicts the force that a human will exert in the next 1 s, using as inputs the force previously exerted by the human, the robot's velocity, and environment information obtained from the robot's LiDAR. Its success rate is up to 92.3% on the test set and up to 89.1% in real experiments. We demonstrate that this force prediction, in addition to being directly usable to detect changes in the human's intention, can be processed to obtain an estimate of the human's desired trajectory. We have validated this approach with a user study involving 18 volunteers.
|
|
10:00-11:30, Paper WeAIP-01.5 | Add to My Program |
Leveraging Saliency-Aware Gaze Heatmaps for Multiperspective Teaching of Unknown Objects |
|
Weber, Daniel | University of Tübingen |
Bolz, Valentin | University of Tübingen |
Zell, Andreas | University of Tübingen |
Kasneci, Enkelejda | University of Tübingen |
Keywords: Human Factors and Human-in-the-Loop, Human-Robot Collaboration, Human-Centered Robotics
Abstract: As robots become increasingly prevalent amidst diverse environments, their ability to adapt to novel scenarios and objects is essential. Advances in modern object detection have also paved the way for robots to identify interaction entities within their immediate vicinity. One drawback is that the robot's operational domain must be known at the time of training, which hinders the robot's ability to adapt to unexpected environments outside the preselected classes. However, when encountering such challenges a human can provide support to a robot by teaching it about the new, yet unknown objects on an ad hoc basis. In this work, we merge augmented reality and human gaze in the context of multimodal human-robot interaction to compose saliency-aware gaze heatmaps leveraged by a robot to learn emerging objects of interest. Our results show that our proposed method exceeds the capabilities of the current state of the art and outperforms it in terms of commonly used object detection metrics.
|
|
10:00-11:30, Paper WeAIP-01.6 | Add to My Program |
Language Guided Temporally Adaptive Perception for Efficient Natural Language Grounding in Cluttered Dynamic Worlds |
|
Patki, Siddharth | University of Rochester |
Arkin, Jacob | Massachusetts Institute of Technology |
Raicevic, Nikola | University of Rochester |
Howard, Thomas | University of Rochester |
Keywords: Multi-Modal Perception for HRI, Human-Robot Collaboration, Natural Dialog for HRI
Abstract: As robots operate alongside humans in shared spaces, such as homes and offices, it is essential to have an effective mechanism for interacting with them. Natural language offers an intuitive interface for communicating with robots, but most of the recent approaches to grounded language understanding reason only in the context of an instantaneous state of the world. Though this allows for interpreting a variety of utterances in the current context of the world, these models fail to interpret utterances which require the knowledge of past dynamics of the world, thereby hindering effective human-robot collaboration in dynamic environments. Constructing a comprehensive model of the world that tracks the dynamics of all objects in the robot's workspace is computationally expensive and difficult to scale with increasingly complex environments. To address this challenge, we propose a learned model of language and perception that facilitates the construction of temporally compact models of dynamic worlds through closed-loop grounding and perception. Our experimental results on the task of grounding referring expressions demonstrate more accurate interpretation of robot instructions in cluttered and dynamic table-top environments without a significant increase in runtime as compared to an open-loop baseline.
|
|
10:00-11:30, Paper WeAIP-01.7 | Add to My Program |
T-Top, an Open Source Tabletop Robot with Advanced Onboard Audio, Vision and Deep Learning Capabilities |
|
Maheux, Marc-Antoine | Université De Sherbrooke |
Panchea, Adina Marlena | Université De Sherbrooke |
Warren, Philippe | Université De Sherbrooke |
Létourneau, Dominic | Université De Sherbrooke |
Michaud, Francois | Université De Sherbrooke |
Keywords: Robot Companions, Deep Learning Methods, Multi-Modal Perception for HRI
Abstract: In recent years, studies on Socially Assistive Robots (SARs) examine how to improve the quality of life of people living with dementia and older adults (OAs) in general. However, most SARs have somewhat limited perception capabilities or interact using simple pre-programmed responses, providing limited or repetitive interaction modalities. Integrating more advanced perceptual capabilities with deep learning processing would help move beyond such limitations. This paper presents T-Top, a tabletop robot designed with advanced audio and vision processing using deep learning neural networks. T-Top is made available as an open source platform with the goal of providing an experimental SAR platform that can implement richer interaction modalities with OAs.
|
|
10:00-11:30, Paper WeAIP-01.8 | Add to My Program |
Learning Human Motion Intention for pHRI Assistive Control |
|
Franceschi, Paolo | CNR-STIIMA |
Bertini, Fabio | Politecnico Di Milano |
Braghin, Francesco | Politecnico Di Milano |
Roveda, Loris | SUPSI-IDSIA |
Pedrocchi, Nicola | National Research Council of Italy (CNR) |
Beschi, Manuel | University of Brescia |
Keywords: Machine Learning for Robot Control, Physical Human-Robot Interaction, Intention Recognition
Abstract: This work addresses human intention identification during physical Human-Robot Interaction (pHRI) tasks to include this information in an assistive controller. To this purpose, human intention is defined as the desired trajectory that the human wants to follow over a finite rolling prediction horizon, so that the robot can assist in pursuing it. This work investigates a Recurrent Neural Network (RNN), specifically a Long Short-Term Memory (LSTM) cascaded with a Fully Connected (FC) layer. In particular, we propose an iterative training procedure to adapt the model. Such an iterative procedure is powerful in reducing the prediction error. Still, it has the drawback that it is time-consuming and does not generalize to different users or different co-manipulated objects. To overcome this issue, Transfer Learning (TL) adapts the pre-trained model to new trajectories, users, and co-manipulated objects by freezing the LSTM layer and fine-tuning the last FC layer, which makes the procedure faster. Experiments show that the iterative procedure adapts the model and reduces prediction error. Experiments also show that TL adapts to different users and to the co-manipulation of a large object. Finally, to check the utility of adopting the proposed method, we compare the proposed controller enhanced by the intention prediction with two other standard pHRI controllers.
|
|
10:00-11:30, Paper WeAIP-01.9 | Add to My Program |
VARIQuery: VAE Segment-Based Active Learning for Query Selection in Preference-Based Reinforcement Learning |
|
Marta, Daniel | KTH Royal Institute of Technology |
Holk, Simon | KTH Royal Institute of Technology |
Pek, Christian | Delft University of Technology |
Tumova, Jana | KTH Royal Institute of Technology |
Leite, Iolanda | KTH Royal Institute of Technology |
Keywords: Human Factors and Human-in-the-Loop, Reinforcement Learning, Representation Learning
Abstract: Human-in-the-loop reinforcement learning (RL) methods actively integrate human knowledge to create reward functions for various robotic tasks. Learning from preferences shows promise as it alleviates the requirement of demonstrations by querying humans on state-action sequences. However, the limited granularity of sequence-based approaches complicates temporal credit assignment. The amount of human querying is contingent on query quality, as redundant queries result in excessive human involvement. This paper addresses the often-overlooked aspect of query selection, which is closely related to active learning (AL). We propose a novel query selection approach that leverages variational autoencoder (VAE) representations of state sequences. In this manner, we formulate queries that are diverse in nature while simultaneously taking into account reward model estimations. We compare our approach to the current state-of-the-art query selection methods in preference-based RL, and find ours to be either on par or more sample-efficient through extensive benchmarking on simulated environments relevant to robotics. Lastly, we conduct an online study to verify the effectiveness of our query selection approach with real human feedback and examine several metrics related to human effort.
|
|
10:00-11:30, Paper WeAIP-01.10 | Add to My Program |
Interactive Spatiotemporal Token Attention Network for Skeleton-Based General Interactive Action Recognition |
|
Wen, Yuhang | Sun Yat-Sen University |
Tang, Zixuan | Sun Yat-Sen University |
Pang, Yunsheng | Tencent Technology (Shenzhen) Co., Ltd., China |
Ding, Beichen | Sun Yat-Sen University |
Liu, Mengyuan | Sun Yat-Sen University |
Keywords: Human and Humanoid Motion Analysis and Synthesis, Deep Learning for Visual Perception, Human-Robot Collaboration
Abstract: Recognizing interactive actions plays an important role in human-robot interaction and collaboration. Previous methods use late fusion and co-attention mechanisms to capture interactive relations, which have limited learning capability or are inefficient to adapt to more interacting entities. Under the assumption that priors of each entity are already known, they also lack evaluations on a more general setting addressing the diversity of subjects. To address these problems, we propose an Interactive Spatiotemporal Token Attention Network (ISTA-Net), which simultaneously models spatial, temporal, and interactive relations. Specifically, our network contains a tokenizer to partition Interactive Spatiotemporal Tokens (ISTs), which are a unified way to represent motions of multiple diverse entities. By extending the entity dimension, ISTs provide better interactive representations. To jointly learn along three dimensions in ISTs, multi-head self-attention blocks integrated with 3D convolutions are designed to capture inter-token correlations. When modeling correlations, a strict entity ordering is usually irrelevant for recognizing interactive actions. To this end, Entity Rearrangement is proposed to eliminate the orderliness in ISTs for interchangeable entities. Extensive experiments on four datasets verify the effectiveness of ISTA-Net by outperforming state-of-the-art methods. Our code is publicly available at https://github.com/Necolizer/ISTA-Net.
|
|
10:00-11:30, Paper WeAIP-01.11 | Add to My Program |
Learning Joint Policies for Human-Robot Dialog and Co-Navigation |
|
Hayamizu, Yohei | SUNY Binghamton |
Yu, Zhou | Columbia University |
Zhang, Shiqi | SUNY Binghamton |
Keywords: Natural Dialog for HRI, Reinforcement Learning, Service Robotics
Abstract: Service robots need language capabilities for communicating with people, and navigation skills for beyond-proximity interaction in the real world. When the robot explores the real world with people side by side, there is the compound problem of human-robot dialog and co-navigation. The human-robot team uses dialog to decide where to go, and their shared spatial awareness affects the dialog state. In this paper, we develop a framework that learns a joint policy for human-robot dialog and co-navigation toward efficiently and accurately completing tour guide and information delivery tasks. We show that our approach outperforms baselines from the literature in task completion rate and execution time, and demonstrate our approach in the real world.
|
|
10:00-11:30, Paper WeAIP-01.12 | Add to My Program |
Natural Language Specification of Reinforcement Learning Policies through Differentiable Decision Trees |
|
Tambwekar, Pradyumna | Georgia Institute of Technology |
Silva, Andrew | Georgia Institute of Technology |
Gopalan, Nakul | Arizona State University |
Gombolay, Matthew | Georgia Institute of Technology |
Keywords: Human-Centered Automation, Human-Centered Robotics
Abstract: Human-AI policy specification is a novel procedure we define in which humans can collaboratively warm-start a robot's reinforcement learning policy. This procedure comprises two steps: (1) Policy Specification, i.e. humans specifying the behavior they would like their companion robot to accomplish, and (2) Policy Optimization, i.e. the robot applying reinforcement learning to improve the initial policy. Existing approaches to enabling collaborative policy specification are often unintelligible black-box methods and are not geared towards making the autonomous system accessible to a novice end-user. In this paper, we develop a novel collaborative framework to enable humans to initialize and interpret an autonomous agent's behavior. Through our framework, we enable humans to specify an initial behavior model via unstructured, natural language, which we convert to lexical decision trees. Next, we leverage these translated human specifications to warm-start reinforcement learning and allow the agent to further optimize these potentially suboptimal policies. Our approach warm-starts an RL agent by utilizing non-expert natural language specifications without incurring additional domain exploration costs. We validate our approach by showing that our model is able to produce >80% translation accuracy, and that policies initialized by a human are able to match the performance of relevant RL baselines in two differing domains.
|
|
10:00-11:30, Paper WeAIP-01.13 | Add to My Program |
Robots Autonomously Detecting People: A Multimodal Deep Contrastive Learning Method Robust to Intraclass Variations |
|
Fung, Angus | University of Toronto |
Benhabib, Beno | University of Toronto |
Nejat, Goldie | University of Toronto |
Keywords: Human-Centered Robotics, Deep Learning for Visual Perception, Human Detection and Tracking
Abstract: Robotic detection of people in crowded and/or cluttered human-centered environments including hospitals, stores and airports is challenging as people can become occluded by other people or objects, and deform due to clothing or pose variations. There can also be loss of discriminative visual features due to poor lighting. In this paper, we present a novel multimodal person detection architecture to address the mobile robot problem of person detection under intraclass variations. We present a two-stage training approach using: 1) a unique pretraining method we define as Temporal Invariant Multimodal Contrastive Learning (TimCLR), and 2) a Multimodal YOLOv4 (MYOLOv4) detector for finetuning. TimCLR learns person representations that are invariant under intraclass variations through unsupervised learning. Our approach is unique in that it generates image pairs from natural variations within multimodal image sequences and contrasts crossmodal features to transfer invariances between different modalities. These pretrained features are used by the MYOLOv4 detector for finetuning and person detection from RGB-D images. Extensive experiments validate the performance of our DL architecture in both human-centered crowded and cluttered environments. Results show that our method outperforms existing unimodal and multimodal person detection approaches in detection accuracy when considering body occlusions and pose deformations in different lighting.
|
|
WeAIP-02 Regular session, Hall E |
Add to My Program |
Clone of 'Human and Robot Teaming' |
|
|
|
10:00-11:30, Paper WeAIP-02.1 | Add to My Program |
Initial Task Allocation for Multi-Human Multi-Robot Teams with Attention-Based Deep Reinforcement Learning |
|
Wang, Ruiqi | Purdue University |
Zhao, Dezhong | Beijing University of Chemical Technology |
Min, Byung-Cheol | Purdue University |
Keywords: Human-Robot Teaming, Task Planning, Human-Robot Collaboration
Abstract: Multi-human multi-robot teams have great potential for complex and large-scale tasks through the collaboration of humans and robots with diverse capabilities and expertise. To efficiently operate such highly heterogeneous teams and maximize team performance in a timely manner, sophisticated initial task allocation strategies that consider individual differences across team members and tasks are required. While existing works have shown promising results in reallocating tasks based on agent state and performance, their neglect of the inherent heterogeneity of the team hinders their effectiveness in realistic scenarios. In this paper, we present a novel formulation of the initial task allocation problem in multi-human multi-robot teams as a contextual multi-attribute decision-making process and propose an attention-based deep reinforcement learning approach. We introduce a cross-attribute attention module to encode the latent and complex dependencies of multiple attributes in the state representation. We conduct a case study in a massive threat surveillance scenario and demonstrate the strengths of our model.
|
|
10:00-11:30, Paper WeAIP-02.2 | Add to My Program |
Human-Robot Collaboration for Unknown Flexible Surface Exploration and Treatment Based on Mesh Iterative Learning Control |
|
Xia, Jingkang | Southwest Jiaotong University, School of Electrical Engineering |
Dickwella Widanage, Kithmi Nima | University of Sussex |
Zhang, Ruiqing | Southwest Jiaotong University |
Parween, Rizuwana | University of Sussex |
Godaba, Hareesh | University of Sussex |
Herzig, Nicolas | University of Sussex |
Glovnea, Romeo | University of Sussex |
Huang, Deqing | Southwest Jiaotong University |
Li, Yanan | University of Sussex |
Keywords: Human-Robot Collaboration, Model Learning for Control, Robust/Adaptive Control
Abstract: Contact tooling operations like sanding and polishing have long been in high demand in robotics and automation, as manual operations are labour-intensive and yield inconsistent quality. However, automating these operations remains a challenge, since they are highly dependent on prior knowledge about the geometry of the workpiece. While several methods have been developed in existing research to automate the geometry learning process and adjust the contact force, such methods heavily require human supervision in the calibration of workpieces and the path planning of robot motion. Furthermore, the stiffness identification of the workpiece is not considered in most of these methods. This paper presents a human-robot collaboration (HRC) framework that is able to perform surface exploration on an unknown object, combining the operator's flexibility with the control precision of the robot. The operator moves the robot along the surface of the target object, and the robot recognizes the surface geometry and surface stiffness while exerting a desired contact force through control. For this purpose, a mesh iterative learning control (MILC) method is developed to learn the surface stiffness, plan the exploration path, and adjust the contact force through repetitive online correction based on HRC. The proof of learning convergence and the results of simulations and experiments on a 7-DOF Sawyer robot platform illustrate the validity of the proposed method.
|
|
10:00-11:30, Paper WeAIP-02.3 | Add to My Program |
Projecting Robot Intentions through Visual Cues: Static vs. Dynamic Signaling |
|
Sonawani, Shubham | Arizona State University |
Zhou, Yifan | Arizona State University |
Ben Amor, Heni | Arizona State University |
Keywords: Virtual Reality and Interfaces, Human-Robot Collaboration, Human-Robot Teaming
Abstract: Augmented and mixed-reality techniques harbor great potential for improving human-robot collaboration. Visual signals and cues may be projected to a human partner in order to explicitly communicate robot intentions and goals. However, it is unclear what type of signals support such a process and whether signals can be combined without adding additional cognitive stress to the partner. This paper focuses on identifying the effective types of visual signals and quantifying their impact through empirical evaluations. In particular, the study compares static and dynamic visual signals within a collaborative object sorting task and assesses their ability to shape human behavior. Furthermore, an information-theoretic analysis is performed to numerically quantify the degree of information transfer between visual signals and human behavior. The results of a human subject experiment show that there are significant advantages to combining multiple visual signals within a single task, i.e., increased task efficiency and reduced cognitive load.
|
|
10:00-11:30, Paper WeAIP-02.4 | Add to My Program |
Robottheory Fitness: GoBot’s Engagement Edge for Spurring Physical Activity in Young Children |
|
Morales Mayoral, Rafael | Oregon State University |
Helmi, Ameer | Oregon State University |
Warren, Shel-Twon | University of Arkansas |
Logan, Samuel W. | Oregon State University |
Fitter, Naomi T. | Oregon State University |
Keywords: Robot Companions, Long term Interaction, Social HRI
Abstract: Children around the world are growing more sedentary over time, which leads to considerable accompanying wellness challenges. Pilot results from our research group have shown that robots may offer something different or better than other developmentally appropriate toys when it comes to motivating physical activity. However, the foundations of this work involved larger-group interactions in which it was difficult to tease apart potential causes of motion, or one-time sessions during which the impact of the robot may have been due to novelty. Accordingly, the work in this paper covers more controlled interactions focused on one robot and one child participant, in addition to considering interactions over longitudinal observation. We discuss the results of a deployment during which N = 8 participants interacted with our custom GoBot robot over two months of weekly sessions. Within each session, the child users experienced a teleoperated robot mode, a semi-autonomous robot mode, and a control condition during which the robot was present but inactive. Results showed that children tended to be more active when the robot was active, and that the teleoperated mode did not yield significantly different results from the semi-autonomous mode. These insights can guide future application of assistive robots in child motor interventions, in addition to informing how these robots can be equipped to assist busy human clinicians.
|
|
10:00-11:30, Paper WeAIP-02.5 | Add to My Program |
Implicit Projection: Improving Team Situation Awareness for Tacit Human-Robot Interaction Via Virtual Shadows |
|
Boateng, Andrew | Arizona State University |
Zhang, Wenlong | Arizona State University |
Zhang, Yu (Tony) | Arizona State University |
Keywords: Virtual Reality and Interfaces, Human-Robot Teaming, Design and Human Factors
Abstract: Fluent teaming is characterized by tacit interaction without explicit communication. Such interaction requires team situation awareness (TSA) to facilitate it. However, existing approaches often rely on explicit communication (such as visual projection) to support TSA, resulting in a paradox. In this paper, we consider implicit projection (IP) to improve TSA for tacit human-robot interaction. IP minimizes interruption and can thus reduce the cognitive demand of maintaining TSA in teaming. We introduce a novel process for achieving IP via virtual shadows (referred to as IPS). We compare our method with two baselines that use explicit projection to maintain TSA. Results from human factors studies demonstrate that IPS supports better TSA and significantly improves unsolicited human responsiveness to robots, a key feature of fluent teaming. Participants rated robots implementing IPS more favorably as teammates. At the same time, our results also demonstrate that IPS is comparable to, and sometimes better than, the best-performing baselines on information accuracy.
|
|
10:00-11:30, Paper WeAIP-02.6 | Add to My Program |
User Interactions and Negative Examples to Improve the Learning of Semantic Rules in a Cognitive Exercise Scenario |
|
Suárez-Hernández, Alejandro | CSIC-UPC |
Andriella, Antonio | Pal Robotics |
Torras, Carme | CSIC-UPC |
Alenyà, Guillem | CSIC-UPC |
Keywords: Human-Robot Collaboration, Human-Robot Teaming, Social HRI
Abstract: Enabling a robot to perform new tasks is a complex endeavor, usually beyond the reach of non-technical users. For this reason, research efforts that aim at empowering end-users to teach robots new abilities using intuitive modes of interaction are valuable. In this article, we present INtuitive PROgramming 2 (INPRO2), a learning framework that allows inferring planning actions from demonstrations given by a human teacher. INPRO2 operates in an assistive scenario, in which the robot may learn from a healthcare professional (a therapist or caregiver) new cognitive exercises that can be later administered to patients with cognitive impairment. INPRO2 features significant improvements over previous work, namely: (1) exploitation of negative examples; (2) proactive interaction with the teacher to ask questions about the legality of certain movements; and (3) learning goals in addition to legal actions. Through simulations, we show the performance of different proactive strategies for gathering negative examples. Real-world experiments with human teachers and a TIAGo robot are also presented to qualitatively illustrate INPRO2.
|
|
10:00-11:30, Paper WeAIP-02.7 | Add to My Program |
Large Language Models As Zero-Shot Human Models for Human-Robot Interaction |
|
Zhang, Bowen | National University of Singapore |
Soh, Harold | National University of Singapore |
Keywords: Human-Robot Collaboration, Human-Centered Robotics, Cognitive Modeling
Abstract: Human models play a crucial role in human-robot interaction (HRI), enabling robots to consider the impact of their actions on humans and plan their behavior accordingly. However, crafting good human models is challenging; capturing context-dependent human behavior requires significant prior knowledge and/or large amounts of interaction data, both of which are difficult to obtain. In this work, we explore the potential of large-language models (LLMs) --- which have consumed vast amounts of human-generated text data --- to act as zero-shot human models for HRI. Our experiments on three social datasets yield promising results; the LLMs are able to achieve performance comparable to purpose-built models. That said, we also discuss current limitations, such as sensitivity to prompts and spatial/numerical reasoning mishaps. Based on our findings, we demonstrate how LLM-based human models can be integrated into a social robot's planning process and applied in HRI scenarios. Specifically, we present one case study on a simulated trust-based table-clearing task and replicate past results that relied on custom models. Next, we conduct a new robot utensil-passing experiment (n = 65) where preliminary results show that planning with an LLM-based human model can achieve gains over a basic myopic plan. In summary, our results show that LLMs offer a promising (but incomplete) approach to human modeling for HRI.
|
|
10:00-11:30, Paper WeAIP-02.8 | Add to My Program |
MPC-Based Human-Accompanying Control Strategy for Improving the Motion Coordination between the Target Person and the Robot |
|
Peng, Jianwei | University of Chinese Academy of Sciences |
Liao, Zhelin | Fujian Agriculture and Forestry University |
Yao, Hanchen | Fujian Institute of Research on the Structure of Matter, Chinese |
Su, Zefan | Fuzhou University |
Zeng, Yadan | Nanyang Technological University |
Dai, Houde | Haixi Institutes, Chinese Academy of Sciences |
Keywords: Human-Robot Collaboration, Service Robotics, Human-Robot Teaming
Abstract: Social robots have gained widespread attention for their potential to assist people in diverse domains, such as living assistance and logistics transportation. Human-accompanying, i.e., walking side-by-side with a person, is an expected and essential capability for social robots. However, due to the complexity of motion coordination between the target person and the mobile robot, the accompanying action is still unstable. In this study, we propose a human-accompanying control strategy that improves motion coordination and thereby makes human-accompanying robots more practical. Our approach allows the robot to adapt to the motion variations of the target person and avoid obstacles while accompanying them. First, a human-robot interaction model based on the separation-bearing-orientation scheme is developed to ascertain the relative position and orientation between the robot and the target person. Then, a human-accompanying controller based on behavioral dynamics and model predictive control (MPC) is designed to avoid obstacles and simultaneously track the direction and velocity of the target person. Experimental results indicate that the proposed method can effectively achieve side-by-side accompanying by simultaneously controlling the relative position, direction, and velocity between the target person and the robot.
|
|
10:00-11:30, Paper WeAIP-02.9 | Add to My Program |
Improved Inference of Human Intent by Combining Plan Recognition and Language Feedback |
|
Idrees, Ifrah | Brown University |
Yun, Tian | Brown University |
Sharma, Naveen | Humans to Robots Lab at Brown University |
Deng, Yunxin | Brown University |
Gopalan, Nakul | Arizona State University |
Tellex, Stefanie | Brown |
Konidaris, George | Brown University |
Keywords: Human-Robot Collaboration, Intention Recognition, Human-Centered Robotics
Abstract: Conversational assistive robots can aid people, especially those with cognitive impairments, in accomplishing various tasks such as cooking meals, performing exercises, or operating machines. However, to interact with people effectively, robots must recognize human plans and goals from noisy observations of human actions, even when the user acts sub-optimally. Previous works on Plan and Goal Recognition (PGR) as planning have used hierarchical task networks (HTN) to model the actor/human. However, these techniques are insufficient, as they do not support user engagement via natural modes of interaction such as language. Moreover, they have no mechanisms to let users, especially those with cognitive impairments, know of a deviation from their original plan or about any sub-optimal actions taken towards their goal. We propose a novel framework for plan and goal recognition in partially observable domains, Dialogue for Goal Recognition (D4GR), which enables a robot to rectify its belief in human progress by asking clarification questions about noisy sensor data and sub-optimal human actions. We evaluate the performance of D4GR over two simulated domains: a kitchen domain and a blocks domain. With language feedback and world-state information in a hierarchical task model, we show that, at the highest sensor noise level, D4GR performs 1% better than HTN in goal accuracy in both domains. For plan accuracy, D4GR outperforms HTN by 4% in the kitchen domain and 2% in the blocks domain. The ALWAYS-ASK oracle outperforms our policy by 3% in goal recognition and 7% in plan recognition, but D4GR achieves this while asking 68% fewer questions than the oracle baseline. We also demonstrate a real-world robot scenario in the kitchen domain, validating the improved plan and goal recognition of D4GR in a realistic setting.
|
|
10:00-11:30, Paper WeAIP-02.10 | Add to My Program |
Online Human Capability Estimation through Reinforcement Learning and Interaction |
|
Sun, Chengke | University of Leeds |
Cohn, Anthony | University of Leeds |
Leonetti, Matteo | King's College London |
Keywords: Human-Robot Collaboration, Reinforcement Learning
Abstract: Service robots are expected to assist users in a constantly growing range of environments and tasks. People may be unique in many ways, and online adaptation of robots is central to personalized assistance. We focus on collaborative tasks in which the human collaborator may not be fully able-bodied, with the aim for the robot to automatically determine the best level of support. We propose a methodology for online adaptation based on Reinforcement Learning and Bayesian inference. As the Reinforcement Learning process continuously adjusts the robot's behavior, the actions that become part of the improved policy are used by the Bayesian inference module as local evidence of human capability, which can be generalized across the state space. The estimated capabilities are then used as pre-conditions to collaborative actions, so that the robot can quickly disable actions that the person seems unable to perform. We demonstrate and validate our approach on two simulated tasks and one real-world collaborative task across a range of motion and sensing capabilities.
|
|
10:00-11:30, Paper WeAIP-02.11 | Add to My Program |
Cognitive Approach to Hierarchical Task Selection for Human-Robot Interaction in Dynamic Environments |
|
Bukhari, Syed Tanweer Shah | University of Central Punjab |
Anima, Bashira Akter | University of Nevada, Reno |
Feil-Seifer, David | University of Nevada, Reno |
Qazi, Wajahat Mahmood | Intelligent Machines & Robotics Group, Department of Computer Sc |
Keywords: Human-Robot Collaboration, Cognitive Control Architectures, Human-Robot Teaming
Abstract: In an efficient and flexible human-robot collaborative work environment, a robot team member must be able to recognize both explicit requests and implied actions from human users. Identifying “what to do” in such cases requires an agent to have the ability to construct associations between objects, their actions, and the effects of actions on the environment. In this regard, semantic memory is introduced to understand explicit cues and their relationships with the objects and skills required to make “tea” and a “sandwich”. We have extended our previous hierarchical robot control architecture to add the capability to execute the most appropriate task based on both feedback from the user and the environmental context. To validate this system, two types of skills were implemented in the hierarchical task tree: 1) tea-making skills and 2) sandwich-making skills. During the conversation between the robot and the human, the robot was able to determine the hidden context using an ontology and began to act accordingly. For instance, if the person says “I am thirsty” or “It is cold outside”, the robot will start to perform the tea-making skill. In contrast, if the person says “I am hungry” or “I need something to eat”, the robot will make the sandwich. A humanoid Baxter robot was used for this experiment. We tested three scenarios with objects at different positions on the table for each skill. We observed that in all cases, the robot used only objects that were relevant to the skill.
|
|
10:00-11:30, Paper WeAIP-02.12 | Add to My Program |
Reward Shaping for Building Trustworthy Robots in Sequential Human-Robot Interaction |
|
Guo, Yaohui | University of Michigan, Ann Arbor |
Yang, X. Jessie | University of Michigan |
Shi, Cong | University of Miami |
Keywords: Human-Robot Teaming, Human-Centered Automation, Acceptability and Trust
Abstract: Trust-aware human-robot interaction (HRI) has received increasing research attention, as trust has been shown to be a crucial factor for effective HRI. Research in trust-aware HRI discovered a dilemma -- maximizing task rewards often leads to decreased human trust, while maximizing human trust would compromise task performance. In this work, we address this dilemma by formulating the HRI process as a two-player Markov game and utilizing the reward-shaping technique to improve human trust while limiting performance loss. Specifically, we show that when the shaping reward is potential-based, the performance loss can be bounded by the potential functions evaluated at the final states of the Markov game. We apply the proposed framework to the experience-based trust model, resulting in a linear program that can be efficiently solved and deployed in real-world applications. We evaluate the proposed framework in a simulation scenario where a human-robot team performs a search-and-rescue mission. The results demonstrate that the proposed framework successfully modifies the robot's optimal policy, enabling it to increase human trust at a minimal task performance cost.
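The potential-based property the abstract invokes is the classic shaping construction F(s, s') = gamma * Phi(s') - Phi(s), which is guaranteed not to change the optimal policy. A minimal sketch, assuming a hypothetical state-indexed potential encoding estimated trust (the names and values are illustrative, not the authors' model):

```python
def shaping_reward(phi, s, s_next, gamma=0.99):
    """Potential-based shaping term F(s, s') = gamma * phi(s') - phi(s).

    Added to the task reward, this term steers learning toward
    high-potential states without altering the optimal policy."""
    return gamma * phi(s_next) - phi(s)

# Hypothetical two-state potential: state 1 carries higher estimated trust.
trust_potential = {0: 0.0, 1: 1.0}.get

bonus = shaping_reward(trust_potential, 0, 1)    # positive: gaining trust
penalty = shaping_reward(trust_potential, 1, 0)  # negative: losing trust
```

The bound the abstract mentions follows from this telescoping structure: summed along a trajectory, the shaping terms collapse to potentials evaluated at the endpoint states, which is what limits the performance loss.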
|
|
10:00-11:30, Paper WeAIP-02.13 | Add to My Program |
Latent Emission-Augmented Perspective-Taking (LEAPT) for Human-Robot Interaction |
|
Chen, Kaiqi | National University of Singapore |
Lim, Jing Yu | National University of Singapore |
Kuan, Kingsley | National University of Singapore |
Soh, Harold | National University of Singapore |
Keywords: Human-Robot Collaboration, Probabilistic Inference, Representation Learning
Abstract: Perspective-taking is the ability to perceive or understand a situation or concept from another individual's point of view, and is crucial in daily human interactions. Enabling robots to perform perspective-taking remains an unsolved problem; existing approaches that use deterministic or handcrafted methods are unable to accurately account for uncertainty in partially-observable settings. This work proposes to address this limitation via a deep world model that enables a robot to perform both perceptual and conceptual perspective-taking, i.e., the robot is able to infer what a human sees and believes. The key innovation is to leverage a decomposed multi-modal latent state space model to generate and augment fictitious observations. Optimizing the ELBO that arises from the underlying probabilistic graphical model enables the learning of uncertainty in latent space, which facilitates uncertainty estimation from high-dimensional observations. We tasked our model with predicting human observations and beliefs on three partially-observable HRI tasks. Experiments show that our method significantly outperforms existing baselines and is able to infer the visual observations available to other agents and their internal beliefs.
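For a sequential latent state-space model of the kind described, the ELBO takes the standard form below (generic notation with observations o and latents z, not the paper's exact factorization): a per-step reconstruction term plus a KL term that keeps the inferred posterior close to the learned latent dynamics, which is where the latent-space uncertainty is learned.

```latex
\log p(o_{1:T}) \;\ge\; \sum_{t=1}^{T}
  \mathbb{E}_{q(z_t \mid o_{\le t})}\!\left[\log p(o_t \mid z_t)\right]
  \;-\; \mathrm{KL}\!\left( q(z_t \mid o_{\le t}) \,\middle\|\, p(z_t \mid z_{t-1}) \right)
```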
|
|
10:00-11:30, Paper WeAIP-02.14 | Add to My Program |
Robust and Context-Aware Real-Time Collaborative Robot Handling Via Dynamic Gesture Commands |
|
Chen, Rui | Carnegie Mellon University; University of Michigan; |
Shek, Alvin | Carnegie Mellon University |
Liu, Changliu | Carnegie Mellon University |
Keywords: Human-Robot Collaboration, Gesture, Posture and Facial Expressions, Learning from Demonstration
Abstract: This paper studies real-time collaborative robot (cobot) handling, where the cobot maneuvers an object under human dynamic gesture commands. Enabling dynamic gesture commands is useful when the human needs to avoid direct contact with the robot or the object handled by the robot. However, the key challenge lies in the heterogeneity of human behaviors and the stochasticity in the perception of dynamic gestures, which requires the robot handling policy to be adaptable and robust. To address these challenges, we introduce the Conditional Collaborative Handling Process (CCHP) to encode a context-aware cobot handling policy, along with a procedure to learn such a policy from human-human collaboration. We thoroughly evaluate the adaptability and robustness of CCHP and apply our approach to a real-time cobot assembly task with a Kinova Gen3 robot arm. Results show that our method leads to significantly less human effort and smoother human-robot collaboration than a state-of-the-art rule-based approach, even with first-time users.
|
|
WeAIP-03 Regular session, Hall E |
Add to My Program |
Clone of 'Field Robots I' |
|
|
|
10:00-11:30, Paper WeAIP-03.1 | Add to My Program |
X-MAS: Extremely Large-Scale Multi-Modal Sensor Dataset for Outdoor Surveillance in Real Environments |
|
Noh, DongKi | LG Electronics Inc |
Sung, Chang Ki | KAIST |
Uhm, Taeyoung | Korean Institute of Robotics and Technology Convergence |
Lee, Wooju | KAIST |
Lim, Hyungtae | Korea Advanced Institute of Science and Technology |
Choi, Jaeseok | Seoul National University |
Lee, Kyuewang | Seoul National University |
Hong, Dasol | KAIST |
Um, Daeho | Seoul National University |
Chung, Inseop | Seoul National University |
Shin, Hochul | Electronics and Telecommunications Research Institute |
Kim, Min-Jung | KAIST |
Kim, Hyoung-Rock | LG Electronics Co. Advanced Research Institute |
Baek, Seung-Min | LG Electronics |
Myung, Hyun | KAIST (Korea Advanced Institute of Science and Technology) |
Keywords: Surveillance Robotic Systems, Field Robots, Data Sets for Robot Learning
Abstract: In the robotics and computer vision communities, extensive studies have been conducted on surveillance tasks, including human detection, tracking, and motion recognition with a camera. Additionally, deep learning algorithms are widely utilized in the aforementioned tasks, as in other computer vision tasks. Existing public datasets are insufficient for developing learning-based methods that handle various surveillance tasks in outdoor and extreme situations, such as harsh weather and low-illuminance conditions. Therefore, we introduce a new large-scale outdoor surveillance dataset named the eXtremely large-scale Multi-modAl Sensor dataset (X-MAS), containing more than 500,000 image pairs and first-person-view data annotated by well-trained annotators. Moreover, a single pair contains multi-modal data (e.g., an IR image, an RGB image, a thermal image, a depth image, and a LiDAR scan). To the best of our knowledge, this is the first large-scale first-person-view outdoor multi-modal dataset focusing on surveillance tasks. We present an overview of the proposed dataset with statistics and describe methods for exploiting our dataset with deep learning-based algorithms. The latest information on the dataset and our study is available at https://github.com/lge-robot-navi, and the dataset will be available for download through a server.
|
|
10:00-11:30, Paper WeAIP-03.2 | Add to My Program |
Athletic Mobile Manipulator System for Robotic Wheelchair Tennis |
|
Zaidi, Zulfiqar | Georgia Institute of Technology |
Martin, Daniel | Georgia Institute of Technology |
Belles, Nathaniel | Georgia Institute of Technology |
Zakharov, Viacheslav | Georgia Institute of Technology |
Krishna, Arjun | Georgia Institute of Technology |
Lee, Kin Man | Georgia Institute of Technology |
Wagstaff, Peter | Georgia Institute of Technology |
Naik, Sumedh | Georgia Institute of Technology |
Sklar, Matthew | Georgia Institute of Technology |
Choi, Sugju | Georgia Institute of Technology |
Kakehi, Yoshiki | Georgia Institute of Technology |
Patil, Ruturaj | Georgia Institute of Technology |
Mallemadugula, Divya | Georgia Institute of Technology |
Pesce, Florian | Georgia Institute of Technology |
Wilson, Peter | Georgia Institute of Technology |
Hom, Wendell | Georgia Institute of Technology |
Diamond, Matan | Georgia Institute of Technology |
Zhao, Bryan | Georgia Institute of Technology |
Moorman, Nina | Georgia Institute of Technology |
Paleja, Rohan | Georgia Institute of Technology |
Chen, Letian | Georgia Institute of Technology |
Seraj, Esmaeil | Georgia Institute of Technology |
Gombolay, Matthew | Georgia Institute of Technology |
Keywords: Engineering for Robotic Systems
Abstract: Athletics are a quintessential and universal expression of humanity. From French monks who in the 12th century invented jeu de paume, the precursor to modern lawn tennis, back to the K'iche' people who played the Maya Ballgame as a form of religious expression over three thousand years ago, humans have sought to train their minds and bodies to excel in sporting contests. Advances in robotics are opening up the possibility of robots in sports. Yet, key challenges remain, as most prior works in robotics for sports are limited to pristine sensing environments, do not require significant force generation, or are on miniaturized scales unsuited for joint human-robot play. In this paper, we propose the first open-source, autonomous robot for playing regulation wheelchair tennis. We demonstrate the performance of our full-stack system in executing ground strokes and evaluate each of the system's hardware and software components. The goal of this paper is to (1) inspire more research in human-scale robot athletics and (2) establish the first baseline for a reproducible wheelchair tennis robot for regulation singles play. Our paper contributes to the science of systems design and poses a set of key challenges for the robotics community to address in striving towards robots that can match human capabilities in sports.
|
|
10:00-11:30, Paper WeAIP-03.3 | Add to My Program |
An Attentional Recurrent Neural Network for Occlusion-Aware Proactive Anomaly Detection in Field Robot Navigation |
|
Schreiber, Andre | University of Illinois Urbana-Champaign |
Ji, Tianchen | University of Illinois at Urbana-Champaign |
McPherson, D. Livingston | University of Illinois |
Driggs-Campbell, Katherine | University of Illinois at Urbana-Champaign |
Keywords: Robotics and Automation in Agriculture and Forestry, Sensor Fusion, Failure Detection and Recovery
Abstract: The use of mobile robots in unstructured environments like the agricultural field is becoming increasingly common. The ability of such field robots to proactively identify and avoid failures is thus crucial for ensuring efficiency and avoiding damage. However, the cluttered field environment introduces various sources of noise (such as sensor occlusions) that make proactive anomaly detection difficult. Existing approaches can show poor performance in sensor occlusion scenarios, as they typically do not explicitly model occlusions and only leverage current sensory inputs. In this work, we present an attention-based recurrent neural network architecture for proactive anomaly detection that fuses current sensory inputs and planned control actions with a latent representation of prior robot state. We enhance our model with an explicitly-learned model of sensor occlusion that is used to modulate the use of our latent representation of prior robot state. Our method shows improved anomaly detection performance and enables mobile field robots to be more resilient against false positives in predicting navigation failure during periods of sensor occlusion, particularly in cases where all sensors are briefly occluded. Our code is available at: https://github.com/andreschreiber/roar.
|
|
10:00-11:30, Paper WeAIP-03.4 | Add to My Program |
Collaborative Trolley Transportation System with Autonomous Nonholonomic Robots |
|
Xia, Bingyi | Southern University of Science and Technology |
Luan, Hao | National University of Singapore |
Zhao, Ziqi | Southern University of Science and Technology |
Gao, Xuheng | Southern University of Science and Technology |
Xie, Peijia | Southern University of Science and Technology |
Xiao, Anxing | National University of Singapore |
Wang, Jiankun | Southern University of Science and Technology |
Meng, Max Q.-H. | The Chinese University of Hong Kong |
Keywords: Automation Technologies for Smart Cities, Intelligent Transportation Systems, Multi-Robot Systems
Abstract: Cooperative object transportation using multiple robots has been intensively studied in the control and robotics literature, but most approaches are either only applicable to omnidirectional robots or lack a complete navigation and decision-making framework that operates in real time. This paper presents an autonomous nonholonomic multi-robot system and an end-to-end hierarchical autonomy framework for collaborative luggage trolley transportation. This framework finds kinematic-feasible paths, computes online motion plans, and provides feedback that enables the multi-robot system to handle long lines of luggage trolleys and navigate obstacles and pedestrians while dealing with multiple inherently complex and coupled constraints. We demonstrate the designed collaborative trolley transportation system through practical transportation tasks, and the experiment results reveal their effectiveness and reliability in complex and dynamic environments.
|
|
10:00-11:30, Paper WeAIP-03.5 | Add to My Program |
Deconfounded Opponent Intention Inference for Football Multi-Player Policy Learning |
|
Wang, Shijie | Institute of Automation, Chinese Academy of Sciences |
Pan, Yi | Institute of Automation, Chinese Academy of Sciences |
Pu, Zhiqiang | University of Chinese Academy of Sciences; Institute of Automati |
Liu, Boyin | University of Chinese Academy of Sciences School of Artificial I |
Yi, Jianqiang | Chinese Academy of Sciences |
Keywords: Agent-Based Systems, Reinforcement Learning, Probabilistic Inference
Abstract: Due to the high complexity of a football match, the opponents' strategies are variable and unknown. Accurately predicting the opponents' future intentions based on the current situation is therefore crucial for football players' decision-making. To better anticipate opponents and learn more effective strategies, this paper proposes a deconfounded opponent intention inference (DOII) method for football multi-player policy learning. Specifically, opponents' intentions are inferred by an opponent intention supervising module. Furthermore, a deconfounded trajectory graph module is designed to mitigate the influence of confounders that affect the causal relationship between the players and the opponents, increasing the accuracy of the inferred opponent intentions. In addition, an opponent-based incentive module is designed to improve the players' sensitivity to the opponents' intentions and thus to train more reasonable player strategies. Representative results indicate that DOII effectively improves the performance of players' strategies in the Google Research Football environment, which validates the superiority of the proposed method.
|
|
10:00-11:30, Paper WeAIP-03.6 | Add to My Program |
Stroke-Based Rendering and Planning for Robotic Performance of Artistic Drawing |
|
Ilinkin, Ivaylo | Gettysburg College |
Song, Daeun | Ewha Womans University |
Kim, Young J. | Ewha Womans University |
Keywords: Art and Entertainment Robotics, Simulation and Animation, Virtual Reality and Interfaces
Abstract: We present a new robotic drawing system based on stroke-based rendering (SBR). Our motivation is the artistic quality of the whole performance: not only should the generated strokes in the final drawing resemble the input image, but the stroke sequence should also exhibit a human artist's planning process. Thus, when a robot executes the drawing task, both the drawing result and the way the robot produces it look artistic. Our SBR system is based on image segmentation and depth estimation. It generates the drawing strokes in an order that allows the intended shape to be perceived quickly and its detailed features to be filled in and emerge gradually when observed by a human. This ordering represents a stroke plan that the drawing robot follows to create an artistic rendering of an image. We experimentally demonstrate that our SBR-based drawing produces visually pleasing artistic images, and our robotic system can replicate the result with proper stroke-drawing sequences.
|
|
10:00-11:30, Paper WeAIP-03.7 | Add to My Program |
Heterogeneous Robot-Assisted Services in Isolation Wards: A System Development and Usability Study |
|
Kwon, Youngsun | Electronics and Telecommunications Research Institute |
Shin, Soyeon | LG Electronics |
Yang, Kyon-Mo | Korea Institute of Robot and Convergence |
Park, Seongah | Korea Institute of Science and Technology (KIST) |
Shin, Soomin | KIST |
Jeon, Hwawoo | Korea Institute of Science and Technology, and Hanyang Univ |
Kim, Kijung | Korea Institute of Science and Technology |
Yun, Guhnoo | Korea Institute of Science and Technology |
Park, Sang Yong | Korea Institute of Science and Technology |
Byun, Jeewon | Softnet |
Kang, Sang Hoon | Ulsan National Institute of Science and Technology (UNIST) / U. O
Song, Kyoung-Ho | Seoul National University Bundang Hospital |
Kim, Doik | KIST |
Kim, Dong Hwan | Korea Institute of Science and Technology |
Seo, Kap-Ho | Korea Institute of Robot and Convergence |
Kwak, Sonya Sona | Korea Institute of Science and Technology (KIST) |
Lim, Yoonseob | Korea Institute of Science and Technology |
Keywords: Service Robotics, Product Design, Development and Prototyping, Software-Hardware Integration for Robot Systems
Abstract: Isolation wards use quarantine rooms to prevent cross-contamination caused by infectious diseases. Despite these benefits, medical personnel face infection risk from patients and a heavy workload due to the isolation. This work proposes a robot-assisted system to alleviate these problems in isolation wards. We conducted a survey of the medical staff's difficulties and their expectations for robots. Based on the survey results, we devised three valuable services using two kinds of heterogeneous robots: telemedicine, emergency alert, and delivery services provided by care robots and delivery robots. Our system also provides user-interactive components such as a dashboard for medical staff and a patient app for inpatients. To manage the services efficiently, we propose a robotic system based on a central control server and a hierarchical management architecture. Through a user study, we reviewed the usability of the developed system and its future directions.
|
|
10:00-11:30, Paper WeAIP-03.8 | Add to My Program |
Irregular Change Detection in Sparse Bi-Temporal Point Clouds Using Learned Place Recognition Descriptors and Point-To-Voxel Comparison |
|
Stathoulopoulos, Nikolaos | Luleå University of Technology, Robotics and AI Group |
Koval, Anton | Luleå University of Technology |
Nikolakopoulos, George | Luleå University of Technology |
Keywords: Field Robots, Mining Robotics, AI-Based Methods
Abstract: Change detection and irregular object extraction in 3D point clouds is a challenging task that is of high importance not only for autonomous navigation but also for updating existing digital twin models of various industrial environments. This article proposes an innovative approach for change detection in 3D point clouds using deep-learned place recognition descriptors and irregular object extraction based on voxel-to-point comparison. The proposed method first aligns the bi-temporal point clouds using a map-merging algorithm in order to establish a common coordinate frame. Then, it utilizes deep learning techniques to extract robust and discriminative features from the 3D point cloud scans, which are used to detect changes between consecutive point cloud frames and thereby find the changed areas. Finally, the altered areas are sampled and compared between the two time instances to extract any obstructions that caused the area to change. The proposed method was successfully evaluated in real-world field experiments, where it detected different types of changes in 3D point clouds, such as the addition or displacement of objects and muck piles, showcasing the effectiveness of the approach. The results of this study have important implications for various applications, including safety and security monitoring on construction sites, mapping, and exploration, and suggest potential future research directions in this field.
|
|
10:00-11:30, Paper WeAIP-03.9 | Add to My Program |
Magnetically Controlled Cell Robots with Immune-Enhancing Potential |
|
Sun, Hongyan | Beihang University |
Dai, Yuguo | Beihang University |
Zhang, Jiaying | Beihang University, School of Mechanical Engineering &Automation |
Xu, Junjie | Beijing University of Aeronautics and Astronautics |
Lina, Jia | Beihang University |
Wang, Chutian | Beihang University |
Wang, Luyao | Beihang University |
Li, Chan | Beihang University |
Bai, Xue | School of Mechanical Engineering & Automation, Beihang Universit |
Chen, Bo | School of Mechanical Engineering & Automation, Beihang Universit |
Feng, Lin | Beihang University |
Keywords: Micro/Nano Robots, Cellular and Modular Robots, Field Robots
Abstract: Magnetic microrobots exhibit enormous potential for targeted drug delivery owing to remote wireless manipulation and minimal invasiveness in medical treatment. A high degree of freedom gives magnetically propelled robots extraordinary application prospects, since they can be controlled precisely when different magnetic field sources work cooperatively. However, the biocompatibility of microrobots has attracted sustained and general concern. It is therefore highly necessary to develop a promising carrier with high biocompatibility and to investigate the mechanism of drug loading and release triggered by the special microenvironment of the targeted region. In this paper, we propose magnetically controlled cell robots (MCRs) based on macrophages propelled by a rotating magnetic field. The MCRs exhibit good biocompatibility and low toxicity when the concentration of polylysine-coated Fe nanoparticles (PLL@FeNPs) is optimized to 40 μg/mL. The MCRs are loaded with murine interleukin-12 (IL-12), murine chemokine (C-C motif) ligand 5 (CCL-5), and murine C-X-C motif chemokine ligand 10 (CXCL-10), which stimulate T cell differentiation and the recruitment of monocytes, respectively. The macrophages showed an obvious M1-polarization tendency to phagocytose intracellular pathogens and resist the growth of tumor cells. Under a magnetic propelling system composed of three pairs of Helmholtz coils, the cell robots can be propelled wirelessly and moved along a predefined path with high accuracy. Moreover, the MCRs could approach cancer cells and stop at places of interest in vitro. In conclusion, we have accomplished the preliminary construction of a targeted drug delivery system that displays great immune-enhancing potential.
|
|
10:00-11:30, Paper WeAIP-03.10 | Add to My Program |
Tightly-Coupled Visual-DVL Fusion for Accurate Localization of Underwater Robots |
|
Huang, Yupei | Institute of Automation, Chinese Academy of Sciences |
Li, Peng | Institute of Automation, Chinese Academy of Sciences |
Yan, Shuaizheng | Institute of Automation, Chinese Academy of Sciences |
Ou, Yaming | Chinese Academy of Sciences |
Wu, Zhengxing | Chinese Academy of Sciences |
Tan, Min | Institute of Automation, Chinese Academy of Sciences |
Yu, Junzhi | Chinese Academy of Sciences |
Keywords: Marine Robotics, Field Robots, Localization
Abstract: This paper proposes a tightly-coupled visual-Doppler-Velocity-Log (visual-DVL) fusion method for underwater robot localization that integrates the velocity measurements from a DVL into a visual odometry (VO). Considering that employing DVL measurements in dead-reckoning systems easily leads to error accumulation and the suboptimal results of previous works, we directly integrate them into the visual tracking process. Specifically, the velocity measurements are utilized to improve the initial estimation of the camera pose during visual tracking, providing a better initial value for pose optimization. Thereafter, these velocity measurements are also directly employed to constrain the position change of the camera between two adjacent frames by constructing a novel DVL error term, which is optimized jointly with the visual constraints to obtain a more accurate camera pose. Various experiments are carried out on datasets collected from several scenarios in the underwater simulation environment HoloOcean, and the results illustrate that the proposed fusion method can improve the localization accuracy of underwater robots by about 20% compared to pure visual odometry. The proposed method provides valuable guidance for the accurate localization of underwater robots.
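The kind of DVL error term described above can be sketched in a few lines: the camera's position change between two frames should match the body-frame DVL velocity rotated into the world frame and integrated over the frame interval. This is a minimal illustration, not the paper's residual — camera-DVL extrinsics, sensor biases, and the joint optimization with visual terms are all omitted.

```python
import numpy as np

def dvl_position_residual(p_prev, p_curr, R_prev, v_dvl, dt):
    """Toy DVL error term: residual between the estimated position change
    of two adjacent frames and the integrated DVL velocity.
    p_prev, p_curr: world-frame positions; R_prev: world-from-body rotation
    at the previous frame; v_dvl: body-frame DVL velocity; dt: frame interval."""
    return (p_curr - p_prev) - R_prev @ v_dvl * dt
```

In a joint optimization this residual would be weighted and summed with the visual reprojection terms; here it only shows the constraint's shape.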
|
|
10:00-11:30, Paper WeAIP-03.11 | Add to My Program |
Fully Proprioceptive Slip-Velocity-Aware State Estimation for Mobile Robots Via Invariant Kalman Filtering and Disturbance Observer |
|
Yu, Xihang | University of Michigan |
Teng, Sangli | University of Michigan, Ann Arbor |
Chakhachiro, Theodor | American University of Beirut |
Tong, Wenzhe | University of Michigan, Ann Arbor |
Li, Tingjun | University of Michigan |
Lin, Tzu-Yuan | University of Michigan |
Koehler, Sarah | University of California, Berkeley |
Ahumada, Manuel | Toyota Research Institute |
Walls, Jeffrey | University of Michigan |
Ghaffari, Maani | University of Michigan |
Keywords: Wheeled Robots, Localization, Field Robots
Abstract: This paper develops a novel slip estimator using the invariant observer design theory and Disturbance Observer (DOB). The proposed state estimator for mobile robots is fully proprioceptive and combines data from an inertial measurement unit and body velocity within a Right Invariant Extended Kalman Filter (RI-EKF). By embedding the slip velocity into the SE3(3) matrix Lie group, the developed DOB-based RI-EKF provides real-time velocity and slip velocity estimates on different terrains. Experimental results using a Husky wheeled robot confirm the mathematical derivations and the effectiveness of the proposed method in estimating the observable state variables. Open-source software is available for download and for reproducing the presented results.
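The matrix Lie group embedding mentioned above can be sketched with the common SE_k(3) convention, where the rotation sits in the top-left block and each extra state vector (here velocity, position, slip velocity) occupies one column. The exact ordering and blocks used in the paper may differ; this only shows why the embedding is a group: products preserve the block structure.

```python
import numpy as np

def embed(R, v, p, s):
    """Embed orientation R (3x3) and three vectors (velocity, position,
    slip velocity) as a 6x6 matrix in SE_3(3)-style block form:
    [[R, v, p, s], [0, I_3]] with an identity bottom-right block."""
    X = np.eye(6)
    X[:3, :3] = R
    X[:3, 3] = v
    X[:3, 4] = p
    X[:3, 5] = s
    return X

def rot_z(th):
    """Rotation about the z-axis by angle th (radians)."""
    c, s = np.cos(th), np.sin(th)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
```

Multiplying two embedded states keeps the bottom blocks fixed and composes the top blocks, which is the closure property the invariant-EKF error propagation relies on.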
|
|
10:00-11:30, Paper WeAIP-03.12 | Add to My Program |
Predicting Energy Consumption and Traversal Time of Ground Robots for Outdoor Navigation on Multiple Types of Terrain |
|
Eder, Matthias | Graz University of Technology |
Steinbauer-Wagner, Gerald | Graz University of Technology |
Keywords: Field Robots, Autonomous Vehicle Navigation, Energy and Environment-Aware Automation
Abstract: The outdoor navigation capabilities of ground robots have improved significantly in recent years, opening up new potential applications in a variety of settings. Cost-based representations of the environment are frequently used in the path planning domain to obtain an optimized path based on various objectives, such as traversal time or energy consumption. However, obtaining such cost representations is still cumbersome, particularly in outdoor settings with diverse terrain types and slope angles. In this paper, we address this problem with a data-driven approach, developing a cost representation for various outdoor terrain types that supports two optimization objectives: energy consumption and traversal time. We train a supervised machine learning model whose inputs consist of extracted environment data along a path and whose outputs are the predicted energy consumption and traversal time. The model is based on a ResNet neural network architecture and trained on field-recorded data. The error of the proposed method on different types of terrain is within 11% of the ground truth. A comparison with a baseline method shows that it performs and generalizes better than currently existing approaches on various types of terrain.
|
|
10:00-11:30, Paper WeAIP-03.13 | Add to My Program |
Informative Path Planning for Scalar Dynamic Reconstruction Using Coregionalized Gaussian Processes and a Spatiotemporal Kernel |
|
Booth, Lorenzo A. | University of California Merced |
Carpin, Stefano | University of California, Merced |
Keywords: Agricultural Automation, Robotics and Automation in Agriculture and Forestry, Planning, Scheduling and Coordination
Abstract: The proliferation of unmanned vehicles offers many opportunities for solving environmental sampling tasks with applications in resource monitoring and precision agriculture. Informative path planning (IPP) is a family of methods that improve on traditional surveying techniques for suggesting locations at which to collect observations. In this work, we present a novel solution to the IPP problem that uses a coregionalized Gaussian process to estimate a dynamic scalar field varying in space and time. Our method improves on previous approaches by using a composite kernel that accounts for spatiotemporal correlations and, at the same time, can be readily incorporated into existing IPP algorithms. Through extensive simulations, we show that our novel modeling approach leads to more accurate estimates than formerly proposed methods that do not account for the temporal dimension.
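The composite spatiotemporal kernel idea above can be sketched as a product of a spatial and a temporal squared-exponential factor, plugged into standard GP regression. This is a generic illustration, not the paper's coregionalized model: the length-scales, the single-output simplification, and the kernel form are illustrative assumptions.

```python
import numpy as np

def spatiotemporal_kernel(X1, X2, ls_space=1.0, ls_time=2.0, var=1.0):
    """Composite kernel k((x,t),(x',t')) = var * k_space(x,x') * k_time(t,t').
    Rows of X are [x, y, t]; both factors are squared-exponential.
    Length-scales are placeholders, not fitted values."""
    d_sp = ((X1[:, None, :2] - X2[None, :, :2]) ** 2).sum(-1)
    d_t = (X1[:, None, 2] - X2[None, :, 2]) ** 2
    return var * np.exp(-0.5 * d_sp / ls_space**2) * np.exp(-0.5 * d_t / ls_time**2)

def gp_posterior_mean(X_train, y_train, X_test, noise=1e-4):
    """Standard GP posterior mean under the composite kernel."""
    K = spatiotemporal_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_star = spatiotemporal_kernel(X_test, X_train)
    return K_star @ np.linalg.solve(K, y_train)
```

Because the temporal factor decays for stale observations, the posterior naturally discounts old samples of a drifting field, which is the benefit the abstract claims over purely spatial models.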
|
|
WeAIP-04 Regular session, Hall E |
Add to My Program |
Optimization and Optimal Control I |
|
|
|
10:00-11:30, Paper WeAIP-04.1 | Add to My Program |
Probabilistic Guarantees for Nonlinear Safety-Critical Optimal Control |
|
Akella, Prithvi | California Institute of Technology |
Ubellacker, Wyatt | California Institute of Technology |
Ames, Aaron | Caltech |
Keywords: Probability and Statistical Methods, Optimization and Optimal Control, Robot Safety
Abstract: Leveraging recent developments in black-box risk-aware verification, we provide three algorithms that generate probabilistic guarantees on (1) optimality of solutions, (2) recursive feasibility, and (3) maximum controller runtimes for general nonlinear safety-critical finite-time optimal controllers. These methods forego the usual (perhaps) restrictive assumptions required for typical theoretical guarantees, e.g. terminal set calculation for recursive feasibility in Nonlinear Model Predictive Control, or convexification of optimal controllers to ensure optimality. Furthermore, we show that these methods can directly be applied to hardware systems to generate controller guarantees on their respective systems.
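The flavor of such sample-based guarantees can be sketched with a standard scenario-style bound: if the maximum of N i.i.d. runtime samples is reported, then with confidence 1 - δ it upper-bounds the (1 - ε) quantile of the runtime distribution whenever (1 - ε)^N ≤ δ. This is a generic illustration of the idea, not the authors' three algorithms; ε, δ, and the runtime callable are placeholders.

```python
import math
import random

def required_samples(epsilon, delta):
    """Smallest N with (1 - epsilon)^N <= delta, i.e. the probability that
    all N samples fall below the (1 - epsilon) quantile is at most delta."""
    return math.ceil(math.log(delta) / math.log(1.0 - epsilon))

def runtime_bound(sample_runtime, epsilon=0.05, delta=0.01, seed=0):
    """Draw the required number of runtime measurements and return the max
    as a probabilistic upper bound. `sample_runtime` is any callable
    returning one measurement (a stand-in for timing the solver once)."""
    random.seed(seed)
    n = required_samples(epsilon, delta)
    return max(sample_runtime() for _ in range(n))
```

For ε = 0.05 and δ = 0.01 this needs 90 samples, which is the black-box appeal: no terminal sets or convexification, only repeated evaluation of the controller.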
|
|
10:00-11:30, Paper WeAIP-04.2 | Add to My Program |
Learning from Human Directional Corrections (I) |
|
Jin, Wanxin | Arizona State University |
Murphey, Todd | Northwestern University |
Lu, Zehui | Purdue University |
Mou, Shaoshuai | Purdue University |
Keywords: Optimization and Optimal Control, Physical Human-Robot Interaction, Motion and Path Planning, Human Factors and Human-in-the-Loop
Abstract: This paper proposes a novel approach that enables a robot to learn an objective function incrementally from human directional corrections. Existing methods learn from human magnitude corrections; since a human needs to carefully choose the magnitude of each correction, those methods can easily lead to over-correction and learning inefficiency. The proposed method only requires human directional corrections --- corrections that only indicate the direction of an input change without indicating its magnitude. We only assume that each correction, regardless of its magnitude, points in a direction that improves the robot's current motion relative to an implicit objective function. For each directional correction, the proposed method updates the estimate of the objective function based on a cutting plane method, which has a geometric interpretation. We have established theoretical results showing the convergence of the learning process. The proposed method has been tested in numerical examples, a user study with two human-robot games, and a real-world quadrotor experiment.
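The cutting-plane geometry can be sketched as follows: if the objective is linear in unknown weights θ, each directional correction yields a halfspace a·θ ≤ 0 through the origin that the true weights must satisfy, and an estimate is taken from the interior of the remaining feasible cone. This toy uses sphere sampling and the feasible-sample mean as a crude stand-in for the paper's center computation; the constraint construction and update rule here are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

def estimate_theta(cuts, dim, n_samples=20000, seed=0):
    """Each correction contributes a cut a @ theta <= 0 through the origin.
    Estimate theta as the normalized mean of unit-sphere samples consistent
    with all cuts (a rough proxy for a feasible-set center)."""
    rng = np.random.default_rng(seed)
    S = rng.normal(size=(n_samples, dim))
    S /= np.linalg.norm(S, axis=1, keepdims=True)
    ok = np.ones(n_samples, dtype=bool)
    for a in cuts:
        ok &= S @ a <= 0.0            # keep samples satisfying this cut
    center = S[ok].mean(axis=0)       # mean direction of the feasible cone
    return center / np.linalg.norm(center)
```

Each new correction shrinks the feasible cone, so the estimate converges toward the direction of the true weights without the human ever specifying a magnitude.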
|
|
10:00-11:30, Paper WeAIP-04.3 | Add to My Program |
Non-Gaussian Uncertainty Minimization Based Control of Stochastic Nonlinear Robotic Systems |
|
Han, Weiqiao | Massachusetts Institute of Technology |
M. Jasour, Ashkan | MIT |
Williams, Brian | MIT |
Keywords: Optimization and Optimal Control, Motion and Path Planning, Probability and Statistical Methods
Abstract: In this paper, we consider the closed-loop control problem of nonlinear robotic systems in the presence of probabilistic uncertainties and disturbances. More precisely, we design a state feedback controller that minimizes deviations of the states of the system from the nominal state trajectories due to uncertainties and disturbances. Existing approaches to address the control problem of probabilistic systems are limited to particular classes of uncertainties and systems such as Gaussian uncertainties and processes and linearized systems. We present an approach that deals with nonlinear dynamics models and arbitrary known probabilistic uncertainties. We formulate the controller design problem as an optimization problem in terms of statistics of the probability distributions including moments and characteristic functions. In particular, in the provided optimization problem, we use moments and characteristic functions to propagate uncertainties throughout the nonlinear motion model of robotic systems. In order to reduce the tracking deviations, we minimize the uncertainty of the probabilistic states around the nominal trajectory by minimizing the trace and the determinant of the covariance matrix of the probabilistic states. To obtain the state feedback gains, we solve deterministic optimization problems in terms of moments, characteristic functions, and state feedback gains using off-the-shelf interior-point optimization solvers. To illustrate the performance of the proposed method, we compare our method with existing probabilistic control methods.
|
|
10:00-11:30, Paper WeAIP-04.4 | Add to My Program |
Learning Compliant Stiffness by Impedance Control Aware Task Segmentation and Multi-Objective Bayesian Optimization with Priors |
|
Okada, Masashi | Panasonic Holdings Corporation |
Komatsu, Mayumi | Panasonic Corp |
Okumura, Ryo | Panasonic Holdings Corporation |
Taniguchi, Tadahiro | Ritsumeikan University |
Keywords: Optimization and Optimal Control, Compliance and Impedance Control, Learning from Demonstration
Abstract: Rather than traditional position control, impedance control is preferred to ensure the safe operation of industrial robots programmed from demonstrations. However, variable-stiffness learning studies have focused on task performance rather than safety (or compliance). This paper therefore proposes a novel stiffness learning method that satisfies both task-performance and compliance requirements. The proposed method optimizes the task and compliance objectives (T/C objectives) simultaneously via multi-objective Bayesian optimization. We define the stiffness search space by segmenting a demonstration into task phases, each governed by a constant stiffness. The segmentation is performed by identifying impedance-control-aware switching linear dynamics (IC-SLD) from the demonstration. We also use the stiffness obtained by the proposed IC-SLD as priors for efficient optimization. Experiments on simulated tasks and a real robot demonstrate that IC-SLD-based segmentation and the use of priors improve optimization efficiency compared to existing baseline methods.
|
|
10:00-11:30, Paper WeAIP-04.5 | Add to My Program |
Efficient Exploration Using Extra Safety Budget in Constrained Policy Optimization |
|
Xu, Haotian | Tsinghua University |
Wang, Shengjie | Tsinghua University |
Wang, Zhaolei | Beijing Aerospace Automatic Control Institute |
Zhang, Yunzhe | Tsinghua University |
Zhuo, Qing | Tsinghua University |
Gao, Yang | Tsinghua University |
Zhang, Tao | Tsinghua University |
Keywords: Optimization and Optimal Control, Reinforcement Learning, Motion and Path Planning
Abstract: Reinforcement learning (RL) has achieved promising results on most robotic control tasks. The safety of learning-based controllers is essential to ensuring their effectiveness. Current methods enforce the same strict constraints throughout training, resulting in inefficient exploration in the early stage. In this paper, we propose an algorithm named Constrained Policy Optimization with Extra Safety Budget (ESB-CPO) to strike a balance between exploration efficiency and constraint satisfaction. In the early stage, our method loosens the practical constraints on unsafe transitions (adding an extra safety budget) with the aid of a new metric we propose. As training proceeds, the constraints in our optimization problem become tighter. Meanwhile, theoretical analysis and practical experiments demonstrate that our method gradually meets the cost limit's demand in the final training stage. When evaluated on the Safety-Gym and Bullet-Safety-Gym benchmarks, our method shows advantages over baseline algorithms in terms of safety and optimality. Notably, it gains a remarkable performance improvement under the same cost limit compared with the baselines.
|
|
10:00-11:30, Paper WeAIP-04.6 | Add to My Program |
Single-Level Differentiable Contact Simulation |
|
Le Cleac'h, Simon | Stanford University |
Schwager, Mac | Stanford University |
Manchester, Zachary | Carnegie Mellon University |
Sindhwani, Vikas | Google Brain, NYC |
Florence, Peter | MIT |
Singh, Sumeet | Google |
Keywords: Optimization and Optimal Control, Simulation and Animation, Dynamics
Abstract: We present a differentiable formulation of rigid-body contact dynamics for objects and robots represented as compositions of convex primitives. Existing optimization-based approaches simulating contact between convex primitives rely on a bilevel formulation that separates collision detection and contact simulation. These approaches are unreliable in realistic contact simulation scenarios because isolating the collision detection problem introduces contact location non-uniqueness. Our approach combines contact simulation and collision detection into a unified single-level optimization problem. This disambiguates the collision detection problem in a physics-informed manner. Our formulation features improved simulation robustness and a reduction in computational complexity when compared to a similar differentiable simulation baseline. We illustrate the contact and collision differentiability on a robotic manipulation task requiring optimization-through-contact. We provide a numerically efficient implementation of our formulation in the Julia language called Silico.jl.
|
|
10:00-11:30, Paper WeAIP-04.7 | Add to My Program |
Quadratic Dynamic Matrix Control for Fast Cloth Manipulation |
|
Caldarelli, Edoardo | Institut de Robòtica i Informàtica Industrial (CSIC-UPC) |
Colomé, Adrià | Institut de Robòtica i Informàtica Industrial (CSIC-UPC), Q28180 |
Ocampo-Martinez, Carlos | Universitat Politècnica de Catalunya - BarcelonaTECH (UPC) |
Torras, Carme | CSIC-UPC |
Keywords: Optimization and Optimal Control, Motion Control, Probability and Statistical Methods
Abstract: Robotic cloth manipulation is an increasingly relevant area of research, challenging classic control algorithms due to the deformable nature of cloth. While it is possible to apply linear model predictive control to make the robot move the cloth according to a given reference, this approach suffers from a large dimensionality of the state-space representation of the cloth models. To address this issue, in this work we study the application of an input-output model predictive control strategy, based on quadratic dynamic matrix control, to robotic cloth manipulation. To account for uncertain disturbances on the cloth's motion, we further extend the algorithm with suitable chance constraints. In extensive simulated experiments, involving disturbances and obstacle avoidance, we show that quadratic dynamic matrix control can be successfully applied in different cloth manipulation scenarios, with significant gains in optimization speed compared to standard model predictive control strategies. The experiments further demonstrate that the closed-loop model used by quadratic dynamic matrix control can be beneficial to the tracking accuracy, leading to improvements over the standard predictive control strategy. Moreover, a preliminary experiment on a real robot shows that quadratic dynamic matrix control can indeed be employed in real settings.
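The dynamic-matrix machinery underlying QDMC can be sketched for an unconstrained single-input single-output case: the predicted output change over the horizon is a step-response (dynamic) matrix applied to the future input moves, and the moves are chosen by a regularized least-squares fit to the reference. This is a textbook-style illustration, not the paper's cloth controller with chance constraints; the horizons, model, and regularization weight are placeholders.

```python
import numpy as np

def dynamic_matrix(step_resp, horizon_p, horizon_m):
    """Dynamic matrix A (P x M) built from step-response coefficients
    s_1..s_P: predicted output change = A @ [du_0, ..., du_{M-1}]."""
    A = np.zeros((horizon_p, horizon_m))
    for i in range(horizon_p):
        for j in range(horizon_m):
            if i - j >= 0:
                A[i, j] = step_resp[i - j]
    return A

def qdmc_moves(step_resp, y_free, ref, horizon_m, lam=0.1):
    """Unconstrained QDMC step: minimize ||ref - y_free - A du||^2
    + lam * ||du||^2, where y_free is the free (no-move) response."""
    P = len(y_free)
    A = dynamic_matrix(step_resp, P, horizon_m)
    H = A.T @ A + lam * np.eye(horizon_m)
    return np.linalg.solve(H, A.T @ (ref - y_free))
```

Working directly with measured step responses is what keeps the controller's dimensionality tied to the horizons rather than to a full state-space cloth model, which is the speed advantage the abstract reports.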
|
|
10:00-11:30, Paper WeAIP-04.8 | Add to My Program |
A Gaussian Process Model for Opponent Prediction in Autonomous Racing |
|
Zhu, Edward | University of California, Berkeley |
Busch, Finn Lukas | Hamburg University of Technology |
Johnson, Jake | University of California, Berkeley |
Borrelli, Francesco | University of California, Berkeley |
Keywords: Probabilistic Inference, Optimization and Optimal Control, Machine Learning for Robot Control
Abstract: In head-to-head racing, performing tightly constrained but highly rewarding maneuvers, such as overtaking, requires an accurate model of the interactive behavior of the opposing target vehicle (TV). We propose to construct a prediction model from data of the TV collected in previous races. In particular, a one-step Gaussian process (GP) model is trained on closed-loop interaction data to learn the behavior of a TV driven by an unknown policy. Predictions of the nominal trajectory and associated uncertainty are rolled out via a sampling-based approach and are used in a model predictive control (MPC) policy for the ego vehicle in order to intelligently trade off between safety and performance when racing against a TV. In a Monte Carlo study, we compare the GP-based predictor in closed loop with the MPC policy against several predictors from the literature and observe that the GP-based predictor achieves similar win rates while maintaining safety in up to 3x more races. Through experiments, we demonstrate the approach in real time on a 1/10th-scale racecar platform operating at speeds of around 2.8 m/s, and show a significant improvement when using the GP-based predictor over a baseline MPC predictor. Videos of the experiments can be found at https://youtu.be/KMSs4ofDfIs.
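The one-step-GP-plus-sampling-rollout idea can be sketched on a toy 1-D state (the paper's TV state, features, and kernel are far richer; everything below — the kernel, length-scale, and dynamics — is an illustrative assumption):

```python
import numpy as np

def gp_fit(X, y, ls=1.0, var=1.0, noise=1e-3):
    """Fit an exact GP to scalar one-step transitions x_t -> x_{t+1}."""
    K = var * np.exp(-0.5 * (X[:, None] - X[None, :])**2 / ls**2)
    alpha = np.linalg.solve(K + noise * np.eye(len(X)), y)
    return X, alpha, ls, var, noise, K

def gp_predict(model, x):
    """Posterior mean and (approximate) predictive std at a query state x."""
    X, alpha, ls, var, noise, K = model
    k = var * np.exp(-0.5 * (X - x)**2 / ls**2)
    mean = k @ alpha
    w = np.linalg.solve(K + noise * np.eye(len(X)), k)
    std = np.sqrt(max(var - k @ w, 0.0) + noise)
    return mean, std

def rollout(model, x0, steps, n_particles=100, seed=0):
    """Sampling-based multi-step rollout: propagate particles through the
    one-step GP to approximate the trajectory distribution."""
    rng = np.random.default_rng(seed)
    particles = np.full(n_particles, x0, dtype=float)
    traj = [particles.copy()]
    for _ in range(steps):
        particles = np.array([m + s * rng.standard_normal()
                              for m, s in (gp_predict(model, p) for p in particles)])
        traj.append(particles.copy())
    return np.array(traj)
```

The spread of the particle cloud at each step is what an MPC layer would consume as predicted opponent uncertainty when trading off safety against performance.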
|
|
10:00-11:30, Paper WeAIP-04.9 | Add to My Program |
Optimal Energy Tank Initialization for Minimum Sensitivity to Model Uncertainties |
|
Pupa, Andrea | University of Modena and Reggio Emilia |
Robuffo Giordano, Paolo | IRISA CNRS UMR6074 |
Secchi, Cristian | Univ. of Modena & Reggio Emilia |
| |