Deep Reinforcement Learning in Action 1st Edition by Alexander Zai, Brandon Brown 9781638350507 1638350507

SKU: EB-50784358
Availability: In stock

Original price was: $50.00. Current price is: $25.00.

Deep Reinforcement Learning in Action, 1st Edition, by Alexander Zai and Brandon Brown (digital instant download)

Author(s): Alexander Zai, Brandon Brown

Edition: 1

File Details: EPUB, 8.85 MB

Year: 2020

Language: English

Description

Deep Reinforcement Learning in Action 1st Edition Alexander Zai Brandon Brown – Ebook PDF Instant Download/Delivery, ISBN: 9781638350507, 1638350507

Product details:

ISBN-10: 1638350507
ISBN-13: 9781638350507
Authors: Alexander Zai, Brandon Brown

Table of contents:

Part 1. Foundations
Chapter 1. What is reinforcement learning?
1.1. The “deep” in deep reinforcement learning
1.2. Reinforcement learning
1.3. Dynamic programming versus Monte Carlo
1.4. The reinforcement learning framework
1.5. What can I do with reinforcement learning?
1.6. Why deep reinforcement learning?
1.7. Our didactic tool: String diagrams
1.8. What’s next?
Summary
Chapter 2. Modeling reinforcement learning problems: Markov decision processes
2.1. String diagrams and our teaching methods
2.2. Solving the multi-arm bandit
2.3. Applying bandits to optimize ad placements
2.4. Building networks with PyTorch
2.5. Solving contextual bandits
2.6. The Markov property
2.7. Predicting future rewards: Value and policy functions
Summary
Chapter 3. Predicting the best states and actions: Deep Q-networks
3.1. The Q function
3.2. Navigating with Q-learning
3.3. Preventing catastrophic forgetting: Experience replay
3.4. Improving stability with a target network
3.5. Review
Summary
Chapter 4. Learning to pick the best policy: Policy gradient methods
4.1. Policy function using neural networks
4.2. Reinforcing good actions: The policy gradient algorithm
4.3. Working with OpenAI Gym
4.4. The REINFORCE algorithm
Summary
Chapter 5. Tackling more complex problems with actor-critic methods
5.1. Combining the value and policy function
5.2. Distributed training
5.3. Advantage actor-critic
5.4. N-step actor-critic
Summary
Part 2. Above and beyond
Chapter 6. Alternative optimization methods: Evolutionary algorithms
6.1. A different approach to reinforcement learning
6.2. Reinforcement learning with evolution strategies
6.3. A genetic algorithm for CartPole
6.4. Pros and cons of evolutionary algorithms
6.5. Evolutionary algorithms as a scalable alternative
Summary
Chapter 7. Distributional DQN: Getting the full story
7.1. What’s wrong with Q-learning?
7.2. Probability and statistics revisited
7.3. The Bellman equation
7.4. Distributional Q-learning
7.5. Comparing probability distributions
7.6. Dist-DQN on simulated data
7.7. Using distributional Q-learning to play Freeway
Summary
Chapter 8. Curiosity-driven exploration
8.1. Tackling sparse rewards with predictive coding
8.2. Inverse dynamics prediction
8.3. Setting up Super Mario Bros.
8.4. Preprocessing and the Q-network
8.5. Setting up the Q-network and policy function
8.6. Intrinsic curiosity module
8.7. Alternative intrinsic reward mechanisms
Summary
Chapter 9. Multi-agent reinforcement learning
9.1. From one to many agents
9.2. Neighborhood Q-learning
9.3. The 1D Ising model
9.4. Mean field Q-learning and the 2D Ising model
9.5. Mixed cooperative-competitive games
Summary
Chapter 10. Interpretable reinforcement learning: Attention and relational models
10.1. Machine learning interpretability with attention and relational biases
10.2. Relational reasoning with attention
10.3. Implementing self-attention for MNIST
10.4. Multi-head attention and relational DQN
10.5. Double Q-learning
10.6. Training and attention visualization
Summary
Chapter 11. In conclusion: A review and roadmap
11.1. What did we learn?
11.2. The uncharted topics in deep reinforcement learning
11.3. The end