Frontiers in Reinforcement Learning
Jie-Han Chen
NetDB, National Cheng Kung University
5/29, 2018 @ National Cheng Kung University, Taiwan
1
Outline
● Transfer Learning
● Curriculum learning
● Snubs in our lectures
● Questions
2
Transfer Learning
3
Transfer Learning
Transfer learning means learning knowledge in a source domain and
then transferring that knowledge to a target domain.
Transfer learning has recently become an active research area because it can improve
both learning speed and final performance.
4
Traditional Machine Learning
Task A, domain A → Learning → Model for task A → Evaluate
Task B, domain B → Learning → Model for task B → Evaluate
We train a separate model for each task from scratch.
Each model is responsible for exactly one task.
5
Transfer Learning
Source task, source domain → Learning → Model for task A
Model for task A → Knowledge Transferring → Model for task B
Model for task B → Evaluate on target task, target domain
We train the model on the source domain and
apply it to a different but related problem.
6
The advantages of transfer learning
● In some critical domains, there is not enough data to train from scratch.
Transfer learning can help in these cases.
Images are from: https://becominghuman.ai/nvidia-and-the-gpu-contribution-to-the-ai-world-of-self-driving-cars-1f00e3212508
and Paper: A Survey on Deep Learning in Medical Image Analysis
7
Zero-shot learning / One-shot learning
● Zero-shot learning: learn the model on the source domain and apply it to the target
domain directly, without any tuning on the target domain.
● One-shot learning: learn the model on the source domain and fine-tune it with very few
samples from the target domain (in the strict sense, a single example).
8
Transfer features from pretrained model
In earlier work, J. Yosinski et al. [1]
studied how well the features
in a neural network transfer.
9
[1] How transferable are features in deep neural networks? [NIPS 2014]
Transfer features from pretrained model
10
Transferred Layers
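The idea of transferring layers from a pretrained model can be sketched roughly as follows. This is a toy illustration, not the setup used in [1]: the "network" is just a list of dicts, and the names `transfer_layers`, `source_net`, and `target_net` are hypothetical. It copies the first n layers from a source model into a target model and optionally freezes them, leaving the remaining layers to be trained on the target task.

```python
import copy


def transfer_layers(source, target, n, freeze=True):
    """Return a copy of `target` whose first n layers are taken from `source`.

    Each layer is a dict with a "weights" list and a "trainable" flag
    (a hypothetical stand-in for a real framework's layer objects).
    """
    if n > min(len(source), len(target)):
        raise ValueError("n exceeds the number of layers")
    result = copy.deepcopy(target)  # never mutate the caller's network
    for i in range(n):
        result[i] = {
            "weights": copy.deepcopy(source[i]["weights"]),
            # Frozen layers keep the source features; unfrozen ones get fine-tuned.
            "trainable": not freeze,
        }
    return result


# Toy networks with four layers each.
source_net = [{"weights": [0.1 * (i + 1)] * 3, "trainable": True} for i in range(4)]
target_net = [{"weights": [0.0] * 3, "trainable": True} for _ in range(4)]

# Transfer the first two layers and freeze them; layers 3 and 4 stay trainable.
new_net = transfer_layers(source_net, target_net, n=2, freeze=True)
```

Passing `freeze=False` instead keeps the copied weights as an initialization and lets the target task fine-tune them.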
Transfer Module Knowledge
● Proposed by Coline Devin et al. (UCB) [2]
● Learn separate modules: one specific to the task, one specific to the robot being controlled
11
The image is from CS294, UCB
[2] Learning Modular Neural Network Policies for Multi-Task and Multi-Robot Transfer
Transfer Module Knowledge
12
The image is from CS294, UCB
Transfer Module Knowledge
13
The image is from CS294, UCB
task-related observation robot-related observation
Transfer Module Knowledge
14
The image is from CS294, UCB
Distill Multitask knowledge into single network
How to learn a multitask policy that can simultaneously perform many tasks?
● Actor-Mimic [3]
● Distral [4]
15
[3] Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning
[4] Distral: Robust Multitask Reinforcement Learning
Actor-Mimic
● Proposed by Emilio Parisotto, Jimmy Ba, and
Ruslan Salakhutdinov.
● Train one network under the guidance of multiple expert networks
● Use supervised learning so that the single multitask
policy mimics the experts
16
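A minimal sketch of the "mimic the experts with supervised learning" step, assuming the expert's policy is a softmax over its Q-values with a temperature and the student is trained with a cross-entropy loss against it. The function names and the toy numbers are hypothetical; a real implementation would compute this loss over batches of states from each task and backpropagate through the student network.

```python
import math


def softmax(values, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [v / temperature for v in values]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def policy_regression_loss(expert_q, student_logits, temperature=1.0):
    """Cross-entropy between the expert's softmax policy and the student's policy."""
    expert_policy = softmax(expert_q, temperature)
    student_policy = softmax(student_logits)
    return -sum(p * math.log(q) for p, q in zip(expert_policy, student_policy))


# One state, three actions (toy values).
expert_q = [1.0, 2.0, 0.5]
matched = policy_regression_loss(expert_q, [1.0, 2.0, 0.5])
mismatched = policy_regression_loss(expert_q, [2.0, 0.5, 1.0])
```

The loss is smallest when the student's action distribution equals the expert's, so `matched` is lower than `mismatched`.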
Actor-Mimic
17
Distral
Distral: Distillation and Transfer Learning, proposed by DeepMind in 2017
● Distillation: combine multiple policies into one, for concurrent multitask
learning (accelerate all tasks through sharing) (from CS294)
18
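One way to picture how the shared (distilled) policy influences each task policy is as a regularized per-step reward. The sketch below assumes the objective has the form "task reward, minus a KL-style penalty toward the shared policy, minus an entropy term"; the function name and the coefficient names `c_kl` and `c_ent` are illustrative assumptions, not an official API.

```python
import math


def distral_step_reward(reward, pi_i, pi_0, c_kl, c_ent):
    """Regularized reward for one step (illustrative sketch).

    reward: environment reward for the chosen action
    pi_i:   probability the task policy assigns to that action
    pi_0:   probability the shared (distilled) policy assigns to it
    c_kl:   weight of the penalty for deviating from the shared policy
    c_ent:  weight of the entropy bonus that encourages exploration
    """
    return reward - c_kl * math.log(pi_i / pi_0) - c_ent * math.log(pi_i)


# The task policy agrees with the shared policy: no KL penalty is paid.
agree = distral_step_reward(1.0, pi_i=0.5, pi_0=0.5, c_kl=0.5, c_ent=0.0)
# The task policy is far more confident than the shared policy: reward is reduced.
deviate = distral_step_reward(1.0, pi_i=0.9, pi_0=0.3, c_kl=0.5, c_ent=0.0)
```

Under this form, a task policy is pulled toward the shared policy unless the task reward justifies deviating, which is how knowledge is shared across tasks.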
Distral
19
Curriculum learning
● Proposed by Yoshua Bengio et al. in 2009 [5]
● They emphasize the importance of the order in which training samples are presented
○ Learn from simple samples first, then from progressively harder ones.
○ Dynamically expand the sample space from a smaller, simpler one to the complicated target domain
● Helps converge to better local optima and can make otherwise hard-to-learn tasks learnable
20
[5] Curriculum Learning, Yoshua Bengio et al. [ICML 2009]
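A small sketch of the "simple samples first, then expand" idea. It assumes a user-supplied difficulty score is available (here, hypothetically, sentence length); the function name `curriculum_stages` and the example sentences are made up for illustration, and choosing a good difficulty measure is the hard part in practice.

```python
def curriculum_stages(samples, difficulty, num_stages):
    """Split training into stages whose sample pools grow from easy to hard.

    samples:    list of training samples
    difficulty: function mapping a sample to a sortable difficulty score
    num_stages: number of curriculum stages
    Returns one sample pool per stage; the last pool contains every sample.
    """
    ordered = sorted(samples, key=difficulty)
    stages = []
    for stage in range(1, num_stages + 1):
        # Each stage keeps the earlier samples and adds the next-hardest ones.
        cutoff = max(1, len(ordered) * stage // num_stages)
        stages.append(ordered[:cutoff])
    return stages


sentences = [
    "a cat",
    "the cat sat on the mat",
    "cats sleep",
    "a cat sat on the warm mat today",
]
# Assumption for this toy example: longer sentences are harder.
stages = curriculum_stages(sentences, difficulty=len, num_stages=2)
```

A training loop would then iterate over `stages` in order, training for some number of steps on each pool before moving on to the next.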
Predict next word
● Corpus: Wikipedia
● Expand learning corpus periodically.
21
expand corpus
How to decide a good curriculum?
● noisy or not
● diversity
● similarity to our target problem or not
22
Self-Play
23
Self-play in AlphaGo Zero [6]
[6] Mastering the game of Go without human knowledge
Self-Play and Curriculum Learning
In Reinforcement Learning, self-play has succeeded on many thorny problems.
DeepMind used self-play to train AlphaGo Zero, which reached much higher
performance than the earlier version that relied on supervised learning.
In self-play, the agent plays against itself. When it starts learning from scratch, its rival is
weak, which is similar to training the model on simpler samples. As the agent
grows stronger, its rival grows stronger too, just as the samples and the problem
become more complicated and more difficult in Curriculum Learning.
24
Snubs in our lecture
1. Active Learning
2. Meta-Learning
3. Inverse RL
4. GAN and RL
5. Model-based RL
6. RL in NN Architecture Searching
25
Questions
Can we distill multiple task policies into a single NN that plays a game involving several tasks?
(Contextual Policy)
26
How to learn AI?
● Find your own path to learn AI foundations; here is my path:
https://github.com/JIElite/Learning-AI
● Read diverse AI papers
● Polish your math skills
● Run many experiments, and learn from practical experience
● Follow some AI researchers on Twitter and Reddit.
27