This is an introductory tutorial on Reinforcement Learning. To cover everything from the basics, I will start with a simple game called CartPole.
Cartpole Double DQN
This is the second part of the reinforcement learning tutorial, where we'll use two (Double) neural networks to train our main model.
Cartpole Dueling DDQN
In this post, we’ll cover Dueling DQN networks for reinforcement learning. This architecture is an improvement on the DDQN from our previous tutorial.
In this part, we'll cover the epsilon-greedy method used in Deep Q-Learning, and we'll fix and prepare our source code for the PER method.
Prioritized Experience Replay
Now we will change the sampling distribution by using a criterion that defines the priority of each experience tuple.
DQN PER with CNN
Now I will show you how to implement DQN with a CNN. After this tutorial, you'll be able to create an agent that successfully plays almost ‘any’ game using only pixel inputs.
Pong with DQN
In this tutorial, I'll implement a Deep Neural Network for Reinforcement Learning (Deep Q Network), and we will watch it learn and finally become good enough to beat the computer in Pong!
RL agents Beyond DQN
To wrap up deep reinforcement learning, I’ll introduce the types of agents beyond DQN (Value, Model, Policy optimization, and Imitation Learning). We'll implement Policy Gradient!
Advantage Actor Critic (A2C)
Today, we'll study a Reinforcement Learning method which we can call a 'hybrid method': Actor Critic. This algorithm combines the value optimization and policy optimization approaches.
Asynchronous Actor Critic (A3C)
In this tutorial, I will provide an implementation of the Asynchronous Advantage Actor-Critic (A3C) algorithm in TensorFlow and Keras. We will use it to solve a simple challenge in the Pong environment!
Policy Optimization (PPO)
In this tutorial, we'll dive into the PPO architecture and implement a Proximal Policy Optimization (PPO) agent that learns to play Pong-v0.