Saif Alabachi · AAMAS
Leveraging Deep Learning Models to Create a Natural Interface for Quadcopter Photography
A quadcopter can capture photos from vantage points unattainable for a human photographer, but teleoperating it to a good viewpoint is a non-trivial task. Since humans are good at composing photos, the aim of our research is to leverage deep learning to create a customizable flight controller that captures photos under the guidance of a human photographer. Our system, the Selfie Drone Stick, allows the user to designate a vantage point for the quadcopter using the phone's sensors. The user takes a single selfie with the phone, and the quadcopter autonomously flies to the corresponding target viewpoint. The proliferation of open-source deep learning models provided us with a large variety of options for the computer vision and flight control systems. This article describes three key innovations required to deploy the models on a real robot: 1) a new architecture for rapid object detection, DUNet; 2) an abstract state representation for transferring learning from simulation to the hardware platform; 3) reward shaping and staging paradigms for training a deep reinforcement learning controller. Without these improvements, the learned flight controller could not adequately support the intuitive user interface.