Statistics in Python Pandas Exercise
Big Data and Automated Content Analysis
Week 5 – Wednesday
»Statistics with Python«
Damian Trilling
d.c.trilling@uva.nl
@damian0604
www.damiantrilling.net
Afdeling Communicatiewetenschap
Universiteit van Amsterdam
7 March 2018
Big Data and Automated Content Analysis Damian Trilling
Today
1 Statistics in Python
    General considerations
    Useful packages
2 Pandas
    Working with dataframes
    Plotting and calculating with Pandas
3 Exercise
Statistics in Python
General considerations
After you have done all your nice text processing (and obtained numbers
instead of text!), you probably want to analyse the results further.
You can always export to .csv and use R, Stata, SPSS, or
whatever...
BUT:
Reasons for not exporting and analyzing somewhere else
• the dataset might be too big
• it's cumbersome and wastes your time
• it may introduce errors and make your analysis harder to reproduce
What statistics capabilities does Python have?
• Basically all standard stuff (bivariate and multivariate statistics) you know from SPSS
• Some advanced stuff (e.g., time series analysis)
• However, for some fancy statistical modelling (e.g., structural equation modelling), you are better off looking elsewhere (R)
Statistics in Python
Useful packages
numpy (numerical Python): provides a lot of frequently used functions, like mean, standard deviation, correlation, ...
scipy (scientific Python): more of that ;-)
statsmodels: statistical models (e.g., regression or time series)
matplotlib: plotting
seaborn: even nicer plotting
Example 1: basic numpy
>>> import numpy as np
>>> x = [1, 2, 3, 4, 3, 2]
>>> y = [2, 2, 4, 3, 4, 2]
>>> z = [9.7, 10.2, 1.2, 3.3, 2.2, 55.6]
>>> np.mean(x)
2.5
>>> np.std(x)
0.9574271077563381
>>> np.corrcoef([x, y, z])
array([[ 1.        ,  0.67883359, -0.37256219],
       [ 0.67883359,  1.        , -0.56886529],
       [-0.37256219, -0.56886529,  1.        ]])
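A note on the standard deviation above: np.std divides by N by default (the population formula), whereas SPSS, R, and pandas divide by N - 1 (the sample formula). The following minimal sketch reuses x from Example 1 and shows both variants; the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

x = [1, 2, 3, 4, 3, 2]

# numpy default (ddof=0): divide by N, the population standard deviation
population_sd = np.std(x)

# ddof=1: divide by N - 1, the sample standard deviation
sample_sd = np.std(x, ddof=1)

# pandas uses ddof=1 by default, so it agrees with sample_sd
pandas_sd = pd.Series(x).std()

print(population_sd, sample_sd, pandas_sd)
```

If your numbers differ slightly from what another package reports, the ddof setting is a likely place to look first.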
Characteristics
• Operates (also) on simple lists
• Returns output in standard datatypes (you can print it, store it, calculate with it, ...)
• It's fast: especially on large numpy arrays, np.mean(x) is faster than sum(x)/len(x)
• It is more accurate (fewer rounding errors)
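To check the speed claim yourself, you could time both approaches. This is a rough sketch on made-up data (one million random numbers), not a careful benchmark; exact timings depend on your machine, and on very short lists the difference may disappear or even reverse because numpy first has to convert the list into an array.

```python
import timeit

import numpy as np

# Hypothetical test data: one million random floats,
# once as a numpy array and once as a plain Python list
rng = np.random.default_rng(seed=42)
values_array = rng.random(1_000_000)
values_list = values_array.tolist()

# Both approaches give (almost exactly) the same mean
mean_numpy = np.mean(values_array)
mean_python = sum(values_list) / len(values_list)

# Time ten repetitions of each approach
time_numpy = timeit.timeit(lambda: np.mean(values_array), number=10)
time_python = timeit.timeit(lambda: sum(values_list) / len(values_list), number=10)

print(f"np.mean:         {mean_numpy:.6f} ({time_numpy:.4f} s for 10 runs)")
print(f"sum(x) / len(x): {mean_python:.6f} ({time_python:.4f} s for 10 runs)")
```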
Example 2: basic plotting
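import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 3, 2]
y = [2, 2, 4, 3, 4, 2]
plt.hist(x)        # histogram of x
plt.show()         # show it before drawing the next plot, so the plots do not overlap
plt.plot(x, y)     # line plot of y against x
plt.show()
plt.scatter(x, y)  # scatter plot of y against x
plt.show()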
Figure: Examples of plots generated with matplotlib
Pandas
Working with dataframes
When to use dataframes
Native Python data structures (lists, dicts, generators)
pro:
• flexible (especially dicts!)
• fast
• straightforward and easy to understand
con:
• if your data is a table, modeling this as, e.g., lists of lists feels unintuitive
• very low-level: you need to do a lot of things 'by hand'
Pandas dataframes
pro:
• like an R dataframe or a Stata or SPSS dataset
• many convenience functions (descriptive statistics, plotting over time, grouping and subsetting, ...)
con:
• not always necessary ('overkill')
• if you deal with really large datasets, you don't want to load them fully into memory (which pandas does)
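To make the trade-off concrete, here is a small sketch that calculates a mean per group twice: once 'by hand' with native data structures and once with a dataframe. The data (article lengths per news source) and all names are invented for illustration.

```python
import pandas as pd

# Hypothetical toy data: one record per article
records = [
    {'source': 'paper_a', 'words': 400},
    {'source': 'paper_b', 'words': 250},
    {'source': 'paper_a', 'words': 600},
    {'source': 'paper_b', 'words': 350},
]

# Native Python: collect the values per group "by hand", then average them
words_per_source = {}
for record in records:
    words_per_source.setdefault(record['source'], []).append(record['words'])
means_by_hand = {source: sum(values) / len(values)
                 for source, values in words_per_source.items()}

# pandas: treat the same records as a table and group in one line
df = pd.DataFrame(records)
means_pandas = df.groupby('source')['words'].mean()

print(means_by_hand)
print(means_pandas)
```

Both routes give the same result; the native version is more explicit, while the pandas version is shorter as soon as the data really is a table.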
Pandas
Plotting and calculating with Pandas
More examples here: https://github.com/damian0604/bdaca/blob/master/ipynb/basic_statistics.ipynb
OLS regression in pandas
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({'income': [10, 20, 30, 40, 50],
                   'age': [20, 30, 10, 40, 50],
                   'facebooklikes': [32, 234, 23, 23, 42523]})

# alternative: read from a CSV file (or Stata, ...):
# df = pd.read_csv('mydata.csv')

# fit an OLS model with income as the dependent variable
myfittedregression = smf.ols(formula='income ~ age + facebooklikes', data=df).fit()
print(myfittedregression.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                 income   R-squared:                       0.579
Model:                            OLS   Adj. R-squared:                  0.158
Method:                 Least Squares   F-statistic:                     1.375
Date:                Mon, 05 Mar 2018   Prob (F-statistic):              0.421
Time:                        18:07:29   Log-Likelihood:                -18.178
No. Observations:                   5   AIC:                             42.36
Df Residuals:                       2   BIC:                             41.19
Df Model:                           2
Covariance Type:            nonrobust
=================================================================================
                    coef    std err          t      P>|t|      [95.0% Conf. Int.]
---------------------------------------------------------------------------------
Intercept        14.9525     17.764      0.842      0.489       -61.481    91.386
age               0.4012      0.650      0.617      0.600        -2.394     3.197
facebooklikes     0.0004      0.001      0.650      0.583        -0.002     0.003
==============================================================================
Omnibus:                          nan   Durbin-Watson:                   1.061
Prob(Omnibus):                    nan   Jarque-Bera (JB):                0.498
Skew:                          -0.123   Prob(JB):                        0.780
Kurtosis:                       1.474   Cond. No.                     5.21e+04
==============================================================================
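If you want to see where the coef column and the R-squared come from, the sketch below recomputes them with plain numpy. Note that this swaps statsmodels for np.linalg.lstsq, so it only illustrates the least-squares calculation and gives no standard errors or p-values; the variable names are chosen for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'income': [10, 20, 30, 40, 50],
                   'age': [20, 30, 10, 40, 50],
                   'facebooklikes': [32, 234, 23, 23, 42523]})

# Design matrix: a column of ones for the intercept, then the two predictors
X = np.column_stack([np.ones(len(df)), df['age'], df['facebooklikes']])
y = df['income'].to_numpy(dtype=float)

# Least-squares solution of X @ beta = y
beta, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)

# R-squared: share of the variance in y explained by the fitted values
residuals = y - X @ beta
r_squared = 1 - (residuals ** 2).sum() / ((y - y.mean()) ** 2).sum()

for name, value in zip(['Intercept', 'age', 'facebooklikes'], beta):
    print(f"{name:15s}{value:10.4f}")
print(f"R-squared: {r_squared:.3f}")
```

With only five observations and three estimated parameters, the estimates are very uncertain, which is what the wide confidence intervals in the summary reflect.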
Other cool df operations
df['age'].plot() to plot a column
df['age'].describe() to get descriptive statistics
df['age'].value_counts() to get a frequency table
and MUCH more...
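A short sketch of these operations on the dataframe from the regression example. The age_group column is a made-up grouping variable, added only to show grouping; plotting is left out so that the example also runs without a display.

```python
import pandas as pd

df = pd.DataFrame({'income': [10, 20, 30, 40, 50],
                   'age': [20, 30, 10, 40, 50],
                   'facebooklikes': [32, 234, 23, 23, 42523]})

# Descriptive statistics: count, mean, std, min, quartiles, max
age_summary = df['age'].describe()

# Frequency table: how often does each value occur?
age_counts = df['age'].value_counts()

# Hypothetical grouping variable, for illustration only
df['age_group'] = ['young' if age <= 25 else 'older' for age in df['age']]

# Mean income per group
mean_income_by_group = df.groupby('age_group')['income'].mean()

print(age_summary)
print(age_counts)
print(mean_income_by_group)
```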
Joanna will introduce you to the exercise
... and of course you can also ask questions about the previous weeks if
you still have any!
