This course is all about applying deep learning and neural networks to reinforcement learning. The combination of deep learning and reinforcement learning has led to AlphaGo beating a world champion at the strategy game Go, has been applied to self-driving cars, and has produced machines that can play video games at a superhuman level. Unlike supervised and unsupervised learning algorithms, reinforcement learning agents have an impetus: they want to reach a goal. In this course, you'll work with more complex environments, specifically those provided by OpenAI Gym.
- Access 52 lectures & 5 hours of content 24/7
- Extend your knowledge of temporal difference learning by looking at the TD Lambda algorithm
- Explore a special type of neural network called the RBF network
- Look at the policy gradient method
- Examine Deep Q-Learning
The Lazy Programmer is a data scientist, big data engineer, and full stack software engineer. For his master's thesis, he worked on brain-computer interfaces using machine learning. These interfaces help non-verbal and non-mobile people communicate with their family and caregivers.
He has worked in online advertising and digital media as both a data scientist and big data engineer, and has built various high-throughput web services around that data. He has created new big data pipelines using Hadoop/Pig/MapReduce, machine learning models to predict click-through rate, and news feed recommender systems using linear regression, Bayesian bandits, and collaborative filtering, and he has validated the results using A/B testing.
He has taught data science, statistics, machine learning, algorithms, calculus, computer graphics, and physics to undergraduate and graduate students attending institutions such as Columbia University, NYU, Humber College, and The New School.
Details & Requirements
- Length of time users can access this course: lifetime
- Access options: web streaming, mobile streaming
- Certification of completion not included
- Redemption deadline: redeem your code within 30 days of purchase
- Experience level required: all levels, but knowledge of calculus, probability, object-oriented programming, Python, NumPy, linear regression, gradient descent, how to build feedforward and convolutional neural networks in Theano and TensorFlow, Markov decision processes, and how to implement dynamic programming, Monte Carlo, and temporal difference methods is expected
- All code for this course is available for download here, in the directory rl2
- Unredeemed licenses can be returned for store credit within 30 days of purchase. Once your license is redeemed, all sales are final.