This is a continuation of my dissertation project on building an autonomous quadrotor. It differs from my original implementation in two ways: it trains with an sb3 (Stable-Baselines3) algorithm, and the reward now also accounts for yaw-axis acceleration. The yaw term was necessary because, without it, the quadrotor spun uncontrollably about that axis.
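To illustrate the idea of the yaw term, here is a minimal sketch, not the reward actually used in this project. It assumes the yaw acceleration is estimated by a finite difference of the yaw rate between two steps and subtracted from the rest of the reward with a fixed weight. The names `yaw_accel_penalty`, `shaped_reward`, `base_reward`, `dt`, and `weight` are all hypothetical.

```python
def yaw_accel_penalty(yaw_rate, prev_yaw_rate, dt, weight=0.1):
    """Penalty proportional to the magnitude of the yaw angular acceleration.

    All arguments are illustrative assumptions:
    yaw_rate / prev_yaw_rate are yaw rates (rad/s) at the current and
    previous step, dt is the step duration (s), weight scales the penalty.
    """
    # Finite-difference estimate of yaw acceleration (rad/s^2).
    yaw_accel = (yaw_rate - prev_yaw_rate) / dt
    return weight * abs(yaw_accel)


def shaped_reward(base_reward, yaw_rate, prev_yaw_rate, dt, weight=0.1):
    """Subtract the yaw penalty from the rest of the reward (base_reward)."""
    return base_reward - yaw_accel_penalty(yaw_rate, prev_yaw_rate, dt, weight)
```

With a term like this, a policy that keeps changing its spin rate about the yaw axis earns less reward than one that holds the yaw rate steady, while a constant yaw rate leaves `base_reward` unchanged.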
There is a glitch in the video below, caused by my incorrect use of ffmpeg when converting the images to mp4. Please ignore it!