Basics Exercise Next meetings
Big Data and Automated Content Analysis
Week 2 – Wednesday
»Getting started with Python«
Damian Trilling
d.c.trilling@uva.nl
@damian0604
www.damiantrilling.net
Afdeling Communicatiewetenschap
Universiteit van Amsterdam
8 April 2014
Big Data and Automated Content Analysis Damian Trilling
Today
1 The very, very basics of programming with Python
Datatypes
Indentation: The Python way of structuring your program
2 Exercise
3 Next meetings
The very, very basics of programming
You’ve read all this in chapter 3.
Datatypes
Python lingo
Basic datatypes (variables)
int 32
float 1.75
bool True, False
string "Damian"
"5" and 5 are not the same.
But you can convert one into the other: int("5") returns 5.
3 * "5" does not return 15: it repeats the string and returns "555".
But 3 * int("5") returns 15.
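The difference between a string and a number can be tried out directly. The following is a minimal sketch, assuming Python 3; the variable names are illustrative and do not come from the slides.

```python
# Illustrative names; none of these variables appear on the slides.
text_five = "5"        # a string
number_five = 5        # an int

same = (text_five == number_five)   # False: a string never equals an int
converted = int(text_five)          # the integer 5
product = 3 * converted             # 15
repeated = 3 * text_five            # "555": the string is repeated, not multiplied

try:
    total = 3 + text_five           # adding an int and a string is not allowed
except TypeError:
    total = None                    # we end up here

print(same, converted, product, repeated, total)
```

If a calculation gives a surprising result, checking type(x) for the variables involved is often the quickest way to find out whether a number is still stored as a string.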
More advanced datatypes
list firstnames = ['Damian', 'Lori', 'Bjoern']
     lastnames = ['Trilling', 'Meester', 'Burscher']
list ages = [18, 22, 45, 23]
dict familynames = {'Bjoern': 'Burscher', 'Damian': 'Trilling', 'Lori': 'Meester'}
dict {'Bjoern': 26, 'Damian': 31, 'Lori': 25}
Note that the elements of a list and the values of a dict can have
any datatype! The keys of a dict can have any immutable datatype
(for example, strings or numbers). (It should be consistent, though!)
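How lists and dicts are accessed can be sketched as follows, reusing the names from the slide. This assumes Python 3; the variables first, how_many, lastname, fullnames and broken are illustrative additions. The last part shows one limit: a list cannot serve as a dict key.

```python
firstnames = ['Damian', 'Lori', 'Bjoern']
familynames = {'Bjoern': 'Burscher', 'Damian': 'Trilling', 'Lori': 'Meester'}

first = firstnames[0]              # lists are accessed by position, starting at 0
how_many = len(firstnames)         # number of elements in the list
lastname = familynames['Lori']     # dicts are accessed by key

# Combine both: look up the family name for every first name in the list.
fullnames = []
for name in firstnames:
    fullnames.append(name + ' ' + familynames[name])

# A list is mutable, so Python refuses to use it as a dict key.
try:
    broken = {['Damian', 'Lori']: 'teachers'}
except TypeError:
    broken = None

print(first, how_many, lastname)
print(fullnames)
```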
Functions
functions: take an input and return something else.
int(32.43) returns the integer 32. len("Hello")
returns the integer 5.
methods: are similar to functions, but directly associated with
an object. "SCREAM".lower() returns the string
"scream".
Calls to both functions and methods end with (). Between the
parentheses, arguments can (and sometimes must) be supplied.
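The examples above can be run as a short script. This is a sketch assuming Python 3; the variable names and the two extra method calls (split and count, both standard string methods) are illustrative additions.

```python
as_int = int(32.43)          # function: cuts off the decimals
length = len("Hello")        # function: counts the characters
quiet = "SCREAM".lower()     # method: called on the string object itself

# Some methods need arguments between the parentheses.
parts = "Damian Trilling".split(" ")   # split the string at every space
count = "banana".count("a")            # count how often "a" occurs

print(as_int, length, quiet, parts, count)
```

Note the difference in notation: a function is written in front of its input, as in len("Hello"), while a method follows the object after a dot, as in "SCREAM".lower().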
Indentation: The Python way of structuring your program
Structure
The program is structured by TABs or SPACEs
firstnames = ['Damian', 'Lori', 'Bjoern']
age = {'Bjoern': 27, 'Damian': 32, 'Lori': 26}
print("The names and ages of all BigData people:")
for naam in firstnames:
    print(naam, age[naam])
Don’t mix up TABs and spaces! Both are valid, but you have
to be consistent!
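What happens when TABs and spaces are mixed can be checked without breaking a real script. The sketch below assumes Python 3 and uses the built-in compile() function on two tiny programs stored as strings; the programs and the helper function check are hypothetical examples, not part of the course material.

```python
# Two tiny programs stored as strings (hypothetical examples).
consistent = "for x in [1, 2]:\n    y = x\n    z = x\n"   # spaces only
mixed = "for x in [1, 2]:\n\ty = x\n        z = x\n"      # a TAB, then spaces

def check(source):
    """Return 'ok' if Python accepts the indentation, else the error name."""
    try:
        compile(source, "<demo>", "exec")
        return "ok"
    except IndentationError as err:   # TabError is a kind of IndentationError
        return type(err).__name__

print(check(consistent))
print(check(mixed))
```

The two indented lines of the second program may look identical in an editor, which is exactly why this mistake is hard to spot. Most editors can be set to insert spaces whenever the TAB key is pressed.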
# firstnames and age are defined as in the previous example
print("The names and ages of all BigData people:")
for naam in firstnames:
    print(naam, age[naam])
    if naam == "Damian":
        print("He teaches this course")
    elif naam == "Lori":
        print("She was an assistant last year")
    elif naam == "Bjoern":
        print("He helps on Wednesdays")
    else:
        print("No idea who this is")
The line before an indented block starts with a statement
indicating what should be done with the block and ends with a colon (:).
Indentation of the block indicates that
• it is to be executed repeatedly (for statement), e.g., for
each element of a list
• it is only to be executed under specific conditions (if, elif,
and else statements)
• an alternative block should be executed if an error occurs
(try and except statements)
• a file is opened, but should be closed again after the block has
been executed (with statement)
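The four kinds of blocks listed above can be combined in one short script. This is a sketch under stated assumptions: Python 3, the age values from the earlier example, a made-up name 'Unknown' to trigger the error case, and a hypothetical output file ages.txt written to a temporary folder.

```python
import os
import tempfile

ages = {'Bjoern': 27, 'Damian': 32, 'Lori': 26}
names = ['Damian', 'Lori', 'Unknown']    # 'Unknown' is a made-up entry

lines = []
for name in names:                       # for: repeat the block for each element
    try:                                 # try: attempt the lookup
        age = ages[name]
    except KeyError:                     # except: runs only if the lookup fails
        age = None
    if age is None:                      # if/elif/else: pick exactly one block
        lines.append(name + ": age unknown")
    elif age > 30:
        lines.append(name + ": over 30")
    else:
        lines.append(name + ": 30 or younger")

# with: the file is closed automatically as soon as the block ends.
path = os.path.join(tempfile.mkdtemp(), "ages.txt")   # hypothetical file name
with open(path, "w") as f:
    for line in lines:
        f.write(line + "\n")

print(lines)
print(f.closed)
```

Each level of nesting adds one level of indentation: the if block sits inside the for block, so it is checked once per name, while the with block starts at the left margin again and therefore runs only once, after the loop has finished.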
We’ll now do the exercise “Describing an existing
structured dataset” together.
Next meetings
Week 3: Data harvesting and storage
Monday, 13 April
A conceptual overview of APIs, scrapers, crawlers, RSS feeds,
databases, and different file formats
Wednesday, 15 April
Writing some first data collection scripts
Preparation
• Conceptual level: Read the article by Morstatter, Pfeffer, Liu,
and Carley (2013) about the limitations of the Twitter API.
• Technical level: Make sure you are comfortable with the
techniques we’ve covered so far. Play around. Give
yourself some tasks and solve them. Google.