Reinforcement Learning Drone Racing

I built an end-to-end reinforcement learning system that teaches a quadrotor to race through a sequence of gates at high speed using a custom Proximal Policy Optimization (PPO) setup in NVIDIA Isaac Sim. The policy maps simulated onboard observations directly to motor commands, learning smooth, directional, and crash-free flight over multiple laps.


I implemented PPO from scratch (with clipping, GAE, and KL control), designed a compact gate-relative observation space, and crafted dense and sparse rewards for progress, heading alignment, and clean gate passes. I also added a curriculum-style reset scheme that randomizes which gate the drone starts near, along with its position and heading around that gate. At each reset, domain randomization over thrust-to-weight ratio, drag, and inner-loop gains varies the dynamics, so the policy becomes robust to those changes and better suited for sim-to-real transfer.
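To make the pieces above concrete, here is a minimal, framework-free sketch of three of them: GAE, the clipped PPO surrogate, and per-reset dynamics sampling. It is not the code from the repository. The function names, the `DynamicsParams` fields, the default hyperparameters, and the sampling ranges are all assumptions chosen for illustration, and the KL control term is left out.

```python
import math
import random
from dataclasses import dataclass


def compute_gae(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one rollout of length T.

    values[t] is V(s_t); last_value is V(s_T), used to bootstrap the last step.
    dones[t] is True when the episode ended after step t.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error; no bootstrapping across an episode boundary.
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective, negated so that lower is better."""
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)


@dataclass
class DynamicsParams:
    # Hypothetical parameter set; the real simulator interface may differ.
    thrust_to_weight: float
    drag_coeff: float
    gain_scale: float


def sample_dynamics(rng):
    """Draw new dynamics at reset. The ranges are placeholders, not tuned values."""
    return DynamicsParams(
        thrust_to_weight=rng.uniform(2.0, 4.0),
        drag_coeff=rng.uniform(0.0, 0.3),
        gain_scale=rng.uniform(0.8, 1.2),
    )
```

In a training loop, `sample_dynamics` would be called at each environment reset, `compute_gae` once per collected rollout, and `ppo_clip_loss` on each minibatch. A real implementation would typically vectorize all three over thousands of parallel simulated environments rather than loop in plain Python.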

👉 GitHub Repository

Robotics Graduate Student