Bridging the Human-Robot Interaction Gap
Autonomous Unmanned Aerial Vehicles (UAVs) are rapidly transforming industries that depend on inspection and surveillance. However, conventional UAV systems often require complex control schemes and lack adaptability, limiting their efficacy in variable environments such as indoor inspections. This paper introduces a system that integrates Generative Pretrained Transformer (GPT) models with dense captioning models for autonomous navigation and fault detection in indoor environments. Our approach allows the drone to interpret and respond to natural language commands with human-like flexibility, greatly improving its accessibility and ease of use. Simultaneously, the drone builds object dictionaries from dense captions of its captured images, giving it a richer understanding of its surroundings. Together, these capabilities let the drone adapt its behavior and handle unexpected scenarios, improving the efficiency and accuracy of indoor inspections. This research contributes to making building inspections more user-friendly and accessible to a broader user base.
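As a minimal sketch of the object-dictionary idea described above, the snippet below aggregates per-frame dense-captioning outputs into a simple object-to-count mapping the drone could consult when interpreting a command. The function name, the `(label, confidence)` output format, and the confidence cutoff are illustrative assumptions, not the paper's actual pipeline.

```python
# Hypothetical sketch: collect object labels mentioned across per-frame
# dense captions into an object dictionary. The (label, confidence) pairs
# stand in for a real dense-captioning model's output.
from collections import Counter

def build_object_dictionary(detections):
    """Return an object->count mapping from captioned detections.

    Labels below the (assumed) confidence cutoff are discarded, and
    labels are lower-cased so duplicates merge.
    """
    counts = Counter()
    for label, confidence in detections:
        if confidence >= 0.5:  # assumed confidence cutoff
            counts[label.lower()] += 1
    return dict(counts)

# Example: detections gathered from two captured frames
detections = [("crack", 0.91), ("pipe", 0.88), ("Crack", 0.76), ("shadow", 0.30)]
print(build_object_dictionary(detections))  # {'crack': 2, 'pipe': 1}
```

In a full system, this dictionary would be serialized into the GPT model's prompt so that natural-language commands can be grounded in what the drone has actually seen.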
Primary Author
JAN 2023 - AUG 2023
Kaiwen Chen
Yining Wen