Deep Q-Network Navigation in PyBullet
Project Overview
This project is an independent implementation of a Deep Q-Network (DQN) agent for autonomous navigation in a 2D PyBullet simulation. The agent learns goal-reaching behavior under uncertainty using value-based reinforcement learning with discrete motion commands, reward shaping, and exploration decay. The environment extends the simple-car-env-template and includes visual diagnostics for training evaluation. Developed as part of an advanced robotics module, the project demonstrates practical skills in learning-based control, simulation engineering, and agent diagnostics.
Demonstration
This section presents a visual walkthrough of the learning process and policy behavior, highlighting both theoretical underpinnings and practical results.

The Markov Decision Process (MDP) framework provides the foundation for value-based reinforcement learning methods like DQN.
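To make the MDP connection concrete, DQN approximates the Q-learning update, whose target is the one-step Bellman backup. The sketch below is illustrative only; the function and variable names (`td_target`, `gamma`, `next_q_values`) are hypothetical and not taken from this project's code.

```python
# Illustrative TD target for Q-learning, which DQN approximates with a
# neural network. All names here are hypothetical.

def td_target(reward, next_q_values, done, gamma=0.99):
    """Bellman backup: r + gamma * max_a' Q(s', a'), zeroed at terminal states."""
    bootstrap = 0.0 if done else gamma * max(next_q_values)
    return reward + bootstrap

# Example: reward 1.0, next-state Q-values for a few discrete actions
target = td_target(1.0, [0.2, 0.5, 0.1], done=False)  # 1.0 + 0.99 * 0.5
```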

Policy evaluation shows the agent navigating efficiently toward the goal while avoiding dynamic obstacles.

The reward curve demonstrates steady policy improvement over training episodes, validating the agent's learning progression.
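Per-episode rewards are noisy, so curves like this are usually smoothed before plotting. A minimal sketch of the kind of sliding-mean smoothing involved (the window size is an illustrative choice, not the project's setting):

```python
def moving_average(rewards, window=50):
    """Smooth a per-episode reward series with a simple sliding mean."""
    out = []
    for i in range(len(rewards)):
        chunk = rewards[max(0, i - window + 1) : i + 1]
        out.append(sum(chunk) / len(chunk))
    return out
```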

The ε-decay curve reflects the agent’s shift from exploratory to exploitative behavior as training proceeds.
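The exploration schedule behind this curve can be sketched as an exponential anneal of ε toward a floor value; the constants below (`eps_start`, `eps_end`, `decay_rate`) are illustrative assumptions, not the project's actual hyperparameters.

```python
import math

def epsilon(step, eps_start=1.0, eps_end=0.05, decay_rate=1e-4):
    """Exponentially anneal exploration rate from eps_start toward eps_end."""
    return eps_end + (eps_start - eps_end) * math.exp(-decay_rate * step)

# Early training: near-random action selection; late training: mostly greedy.
eps_early = epsilon(0)        # 1.0
eps_late = epsilon(10**7)     # approaches the 0.05 floor
```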
Methods
- ✓ Discrete DQN agent with 9 control actions and a multilayer perceptron (MLP) policy network
- ✓ Simulation environment extended from simple-car-env-template with dynamic obstacle generation
- ✓ Reward shaping based on goal proximity, collision penalties, and sparse terminal rewards
- ✓ Epsilon-greedy exploration with exponential decay for efficient policy convergence
- ✓ Evaluation based on success thresholds, cumulative rewards, and behavioral visualization
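The reward-shaping scheme listed above (dense goal-proximity shaping, a collision penalty, and a sparse terminal bonus) might look roughly like the following sketch. All constants and names (`shaped_reward`, the penalty magnitudes, `goal_radius`) are illustrative assumptions, not the project's actual values.

```python
import math

def shaped_reward(pos, prev_pos, goal, collided, goal_radius=0.5):
    """Dense shaping on goal progress plus sparse collision/terminal terms.

    Constants here are hypothetical, chosen only to show the structure.
    """
    d_prev = math.dist(prev_pos, goal)
    d_now = math.dist(pos, goal)
    reward = d_prev - d_now          # positive when the agent moves toward the goal
    if collided:
        reward -= 10.0               # collision penalty
    if d_now < goal_radius:
        reward += 100.0              # sparse terminal success bonus
    return reward
```

Shaping on the *change* in goal distance (rather than the raw distance) keeps the dense term small relative to the terminal bonus, so the sparse goal signal still dominates the learned policy.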