PyData
  NYC 2012
Python and Big Data
• Python has become an established language for
 scientific, engineering, and technical computing

• Ubiquitous in industry for all kinds of problems
 large and small
 - National labs
 - Finance
 - Oil & Gas
 - Consumer Products
 - Aerospace / Defense
Next Steps
• Challenge is now to build/expand this community around
 out-of-core and distributed data structures and
 algorithms.

• A unifying focus of the PyData community.
• The accessibility of Python syntax will empower the next
 generation of “data scientists” just as it has empowered
 this generation of “real” scientists.
A few thoughts on how
• Integration with as much as possible: Python is and
 should remain the ultimate glue (need a Python-JVM
 bridge equivalent to Python for .NET)

• Compelling new features (Python compiler, out-of-core
 data structures, R-inspired data frames with
 hierarchical indexing, metadata-enhanced plotting)

• Focus on solving real problems pragmatically and easily,
 not on language fanboyism or feature fetishism. Empower
 domain experts and the occasional programmer.
Thank you!
• Our Sponsors
 - DE Shaw and Co
 - Appnexus
 - JPMorgan
 - NumFOCUS
 - PSF

• Our Organizers: People at Continuum Analytics, Inc., Lambda-Foundry

• Our Speakers: They put a lot of effort into both the talks and the
 work being discussed

• Attendees: your participation makes this all possible!
Reminders
• Sign up for the hack-a-thon: http://pydata.eventbrite.com/
 - Bring a photo ID that matches your registration name
 - You need to be on the list to be admitted; the list is being sent over at noon
 - Sign up to lead BOFs, sprints, or demos on the wiki page: http://tinyurl.com/pydata-sprints
• Register for dinner tonight (included in ticket)
 - Continuum will present a brief overview of our products and services
   and answer questions, but otherwise it will be a lively discussion
   among friends
 - Special networking track for students and other people looking for jobs
• PyData West in Santa Clara: March 19-21, 2013
 - http://pydatawest2013.pydata.org


PyData Introduction

