Gym & Universe
Ashish Kumar
LinkedIn: https://www.linkedin.com/in/ashkmr1
Twitter: @ashish_fagna
E-mail: ashish.fagna@gmail.com
OpenAI - Introduction
● OpenAI is an artificial intelligence (AI) research company, founded
as a non-profit.
● It aims to promote and develop
friendly AI in a way that benefits
humanity as a whole.
● It aims to "freely collaborate" by
making its patents and research open
to the public.
Latest in News
● OpenAI Five vs. human teams in Dota 2
● The event was streamed online
● OpenAI Five is a set of five neural networks
Source: https://medium.com/deep-math-machine-learning-ai/different-types-of-machine-learning-and-their-types-34760b9128a2
Reinforcement Learning
● One of the most important types of machine learning.
● An agent learns how to behave in an environment by performing actions and
seeing the results.
Reinforcement Learning
There are two basic concepts in reinforcement learning:
1. Environment (namely, the outside world) and
2. Agent (namely, the algorithm you are writing).
The agent sends actions to the environment, and the environment replies with
observations and rewards (that is, a score).
Example: Reinforcement Learning
Imagine you’re a child in a living room.
Action 1: You see a fireplace and approach it. It’s warm
(positive reward: +1).
Action 2: But when you try to touch the fire, it burns your hand
(negative reward: -1).
Learning 1: Fire is positive when you are a sufficient distance
away, because it produces warmth.
Learning 2: But get too close to it and you will be burned.
Image Source: https://medium.freecodecamp.org/an-introduction-to-reinforcement-learning-4339519de419
That’s how humans learn: through interaction.
Reinforcement learning is just a computational approach to learning from action.
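The fireplace example can be sketched in a few lines of Python. This is only an illustration, not a full RL algorithm: the action names and the averaging helper are assumptions made for the sketch, and only the +1 / -1 rewards come from the example above.

```python
from collections import defaultdict

# Rewards from the fireplace example; the action names are illustrative.
REWARDS = {"approach_fire": +1, "touch_fire": -1}


def estimate_action_values(experience):
    """Average the observed reward for each action tried so far."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for action, reward in experience:
        totals[action] += reward
        counts[action] += 1
    return {action: totals[action] / counts[action] for action in totals}


# The child tries a few actions and remembers what happened each time.
tried = ["approach_fire", "touch_fire", "approach_fire"]
experience = [(action, REWARDS[action]) for action in tried]

values = estimate_action_values(experience)
best_action = max(values, key=values.get)

print(values)
print(best_action)
```

After a handful of interactions, the action with the higher average reward is the one the child prefers, which is the "learning" step in the example.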
Reinforcement Learning
Actions influence the state, which determines reward.
Image Source: https://medium.freecodecamp.org/an-introduction-to-reinforcement-learning-4339519de419
Reinforcement Learning Process (Super Mario)
Let’s imagine an agent learning to play Super Mario.
The reinforcement learning (RL) process can be modeled as a loop that works
like this:
● The agent receives state S0 from the environment (in our case, the first
frame of the game (state) from Super Mario (environment)).
● Based on state S0, the agent takes an action A0 (our agent moves right).
● The environment transitions to a new state S1 (a new frame).
● The environment gives a reward R1 to the agent (not dead: +1).
This RL loop outputs a sequence of states, actions and rewards.
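The loop above can be sketched with a toy stand-in for the game. ToyMarioEnv, its reset/step methods and the always_right policy are hypothetical names invented for this sketch; they imitate the state, action, reward cycle and are not the Gym or Universe API.

```python
class ToyMarioEnv:
    """Hypothetical stand-in for the game: walk right to reach the goal."""

    def __init__(self, goal=5):
        self.goal = goal
        self.position = 0

    def reset(self):
        """Start a new episode and return the initial state S0."""
        self.position = 0
        return self.position

    def step(self, action):
        """Apply an action and return (new state, reward, done)."""
        if action == "right":
            self.position += 1
        elif action == "left":
            self.position = max(0, self.position - 1)
        reward = 1  # "not dead: +1", as in the slide
        done = self.position >= self.goal
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """Run the RL loop once and record (state, action, reward) triples."""
    trajectory = []
    state = env.reset()
    for _ in range(max_steps):
        action = policy(state)
        next_state, reward, done = env.step(action)
        trajectory.append((state, action, reward))
        state = next_state
        if done:
            break
    return trajectory


def always_right(state):
    """A fixed policy: move right regardless of the state."""
    return "right"


trajectory = run_episode(ToyMarioEnv(goal=3), always_right)
print(trajectory)
```

The recorded trajectory is the "sequence of states, actions and rewards" the loop outputs; a learning agent would use it to improve its policy.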
The Reinforcement Learning process
● The goal of the agent is to maximize the expected cumulative reward.
● By running more and more loops, the agent will learn to play better and
better.
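A minimal sketch of "cumulative reward" for one episode follows. The discount factor gamma is an assumption added here (the slide does not mention discounting); with gamma=1.0 the function is a plain sum of rewards.

```python
def discounted_return(rewards, gamma=1.0):
    """Sum rewards, weighting the reward at step t by gamma ** t."""
    total = 0.0
    # Work backwards so each step adds its reward to the discounted future.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


rewards = [1, 1, 1]  # e.g. three steps of "not dead: +1"

plain_sum = discounted_return(rewards)
discounted = discounted_return(rewards, gamma=0.5)

print(plain_sum)   # 3.0
print(discounted)  # 1 + 0.5 + 0.25 = 1.75
```

The "expected" part of the goal comes from averaging this quantity over many episodes, since both the agent and the environment may behave randomly.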
OpenAI Gym
● Gym is a toolkit for developing and comparing
reinforcement learning algorithms.
● It supports teaching agents everything
from walking to playing games like Pong
or Pinball.
● Gym Envs:
https://gym.openai.com/envs/#mujoco
● Example environment: Ant-v2
Make a 3D four-legged robot walk.
OpenAI Universe
● A platform for measuring and training an AI's general intelligence across
games, websites and other applications.
● It makes it possible for any existing program to become an OpenAI Gym
environment, without needing special access to the program's internals,
source code, or APIs.
● It does this by packaging the program into a Docker container, and presenting
the AI with the same interface a human uses: sending keyboard and mouse
events, and receiving screen pixels.
● It contains over 1,000 environments in which an AI agent can take actions and
gather observations.
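A rough sketch of what "sending keyboard and mouse events" can look like as data. It assumes the VNC-style event tuples shown in the Universe README, ('KeyEvent', key, down) and ('PointerEvent', x, y, buttonmask); the helper functions are hypothetical and are not part of Universe itself.

```python
def key_event(key, down):
    """One keyboard event: key name plus pressed (True) or released (False)."""
    return ("KeyEvent", key, down)


def pointer_event(x, y, buttonmask=0):
    """One mouse event: cursor position plus a bitmask of held buttons."""
    return ("PointerEvent", x, y, buttonmask)


def press_and_release(key):
    """Tap a key: a press followed by a release."""
    return [key_event(key, True), key_event(key, False)]


def click(x, y):
    """Left-click at (x, y): button down, then button up."""
    return [pointer_event(x, y, 1), pointer_event(x, y, 0)]


# An action is a list of events the agent wants to send on one step.
action = press_and_release("ArrowUp") + click(100, 200)
print(action)
```

The observation coming back would be the screen pixels, so the agent sees and controls the program much like a human at a remote desktop.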
Command: Set Up a Conda Environment
● Conda is an open source package management system and environment
management system that runs on Windows, macOS and Linux.
● Conda quickly installs, runs and updates packages and their dependencies.
● Conda easily creates, saves, loads and switches between environments on
your local computer.
Command:
conda create --name universe-starter-agent python=3.5
source activate universe-starter-agent
OpenAI Universe Demo
Gym vs Universe
● OpenAI Universe is like a much bigger OpenAI Gym.
● OpenAI Gym has some basic tasks, like pole balancing and pendulum
swing-up, and some more difficult ones, like Atari games such as Space
Invaders.
● Gym is like an enclosed world, a “gym” in which to exercise and develop RL algorithms.
● OpenAI Universe has a much wider variety of tasks, and is more focused on
giving RL algorithms the ability to interact with the real world:
playing games, using a (virtual) keyboard and mouse to interact with
buttons and sliders on webpages, etc.
● Universe is built on Gym.
Use Cases
Environments for doing various tasks, such as:
● Sending an email
● Mouse clicks and keyboard events
● More and more environments are being added
Thank You
