Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython, by Wes McKinney
Wes McKinney
Python for Data Analysis
Data Wrangling with Pandas, NumPy,
and IPython
SECOND EDITION
Contents
Preface
1. Preliminaries
  1.1 What Is This Book About?
    What Kinds of Data?
  1.2 Why Python for Data Analysis?
    Python as Glue
    Solving the “Two-Language” Problem
    Why Not Python?
  1.3 Essential Python Libraries
    NumPy
    pandas
    matplotlib
    IPython and Jupyter
    SciPy
    scikit-learn
    statsmodels
  1.4 Installation and Setup
    Windows
    Apple (OS X, macOS)
    GNU/Linux
    Installing or Updating Python Packages
    Python 2 and Python 3
    Integrated Development Environments (IDEs) and Text Editors
  1.5 Community and Conferences
  1.6 Navigating This Book
    Code Examples
    Data for Examples
    Import Conventions
    Jargon
2. Python Language Basics, IPython, and Jupyter Notebooks
  2.1 The Python Interpreter
  2.2 IPython Basics
    Running the IPython Shell
    Running the Jupyter Notebook
    Tab Completion
    Introspection
    The %run Command
    Executing Code from the Clipboard
    Terminal Keyboard Shortcuts
    About Magic Commands
    Matplotlib Integration
  2.3 Python Language Basics
    Language Semantics
    Scalar Types
    Control Flow
3. Built-in Data Structures, Functions, and Files
  3.1 Data Structures and Sequences
    Tuple
    List
    Built-in Sequence Functions
    dict
    set
    List, Set, and Dict Comprehensions
  3.2 Functions
    Namespaces, Scope, and Local Functions
    Returning Multiple Values
    Functions Are Objects
    Anonymous (Lambda) Functions
    Currying: Partial Argument Application
    Generators
    Errors and Exception Handling
  3.3 Files and the Operating System
    Bytes and Unicode with Files
  3.4 Conclusion
4. NumPy Basics: Arrays and Vectorized Computation
  4.1 The NumPy ndarray: A Multidimensional Array Object
    Creating ndarrays
    Data Types for ndarrays
    Arithmetic with NumPy Arrays
    Basic Indexing and Slicing
    Boolean Indexing
    Fancy Indexing
    Transposing Arrays and Swapping Axes
  4.2 Universal Functions: Fast Element-Wise Array Functions
  4.3 Array-Oriented Programming with Arrays
    Expressing Conditional Logic as Array Operations
    Mathematical and Statistical Methods
    Methods for Boolean Arrays
    Sorting
    Unique and Other Set Logic
  4.4 File Input and Output with Arrays
  4.5 Linear Algebra
  4.6 Pseudorandom Number Generation
  4.7 Example: Random Walks
    Simulating Many Random Walks at Once
  4.8 Conclusion
5. Getting Started with pandas
  5.1 Introduction to pandas Data Structures
    Series
    DataFrame
    Index Objects
  5.2 Essential Functionality
    Reindexing
    Dropping Entries from an Axis
    Indexing, Selection, and Filtering
    Integer Indexes
    Arithmetic and Data Alignment
    Function Application and Mapping
    Sorting and Ranking
    Axis Indexes with Duplicate Labels
  5.3 Summarizing and Computing Descriptive Statistics
    Correlation and Covariance
    Unique Values, Value Counts, and Membership
  5.4 Conclusion
6. Data Loading, Storage, and File Formats
  6.1 Reading and Writing Data in Text Format
    Reading Text Files in Pieces
    Writing Data to Text Format
    Working with Delimited Formats
    JSON Data
    XML and HTML: Web Scraping
  6.2 Binary Data Formats
    Using HDF5 Format
    Reading Microsoft Excel Files
  6.3 Interacting with Web APIs
  6.4 Interacting with Databases
  6.5 Conclusion
7. Data Cleaning and Preparation
  7.1 Handling Missing Data
    Filtering Out Missing Data
    Filling In Missing Data
  7.2 Data Transformation
    Removing Duplicates
    Transforming Data Using a Function or Mapping
    Replacing Values
    Renaming Axis Indexes
    Discretization and Binning
    Detecting and Filtering Outliers
    Permutation and Random Sampling
    Computing Indicator/Dummy Variables
  7.3 String Manipulation
    String Object Methods
    Regular Expressions
    Vectorized String Functions in pandas
  7.4 Conclusion
8. Data Wrangling: Join, Combine, and Reshape
  8.1 Hierarchical Indexing
    Reordering and Sorting Levels
    Summary Statistics by Level
    Indexing with a DataFrame’s columns
  8.2 Combining and Merging Datasets
    Database-Style DataFrame Joins
    Merging on Index
    Concatenating Along an Axis
    Combining Data with Overlap
  8.3 Reshaping and Pivoting
    Reshaping with Hierarchical Indexing
    Pivoting “Long” to “Wide” Format
    Pivoting “Wide” to “Long” Format
  8.4 Conclusion
9. Plotting and Visualization
  9.1 A Brief matplotlib API Primer
    Figures and Subplots
    Colors, Markers, and Line Styles
    Ticks, Labels, and Legends
    Annotations and Drawing on a Subplot
    Saving Plots to File
    matplotlib Configuration
  9.2 Plotting with pandas and seaborn
    Line Plots
    Bar Plots
    Histograms and Density Plots
    Scatter or Point Plots
    Facet Grids and Categorical Data
  9.3 Other Python Visualization Tools
  9.4 Conclusion
10. Data Aggregation and Group Operations
  10.1 GroupBy Mechanics
    Iterating Over Groups
    Selecting a Column or Subset of Columns
    Grouping with Dicts and Series
    Grouping with Functions
    Grouping by Index Levels
  10.2 Data Aggregation
    Column-Wise and Multiple Function Application
    Returning Aggregated Data Without Row Indexes
  10.3 Apply: General split-apply-combine
    Suppressing the Group Keys
    Quantile and Bucket Analysis
    Example: Filling Missing Values with Group-Specific Values
    Example: Random Sampling and Permutation
    Example: Group Weighted Average and Correlation
    Example: Group-Wise Linear Regression
  10.4 Pivot Tables and Cross-Tabulation
    Cross-Tabulations: Crosstab
  10.5 Conclusion
11. Time Series
  11.1 Date and Time Data Types and Tools
    Converting Between String and Datetime
  11.2 Time Series Basics
    Indexing, Selection, Subsetting
    Time Series with Duplicate Indices
  11.3 Date Ranges, Frequencies, and Shifting
    Generating Date Ranges
    Frequencies and Date Offsets
    Shifting (Leading and Lagging) Data
  11.4 Time Zone Handling
    Time Zone Localization and Conversion
    Operations with Time Zone–Aware Timestamp Objects
    Operations Between Different Time Zones
  11.5 Periods and Period Arithmetic
    Period Frequency Conversion
    Quarterly Period Frequencies
    Converting Timestamps to Periods (and Back)
    Creating a PeriodIndex from Arrays
  11.6 Resampling and Frequency Conversion
    Downsampling
    Upsampling and Interpolation
    Resampling with Periods
  11.7 Moving Window Functions
    Exponentially Weighted Functions
    Binary Moving Window Functions
    User-Defined Moving Window Functions
  11.8 Conclusion
12. Advanced pandas
  12.1 Categorical Data
    Background and Motivation
    Categorical Type in pandas
    Computations with Categoricals
    Categorical Methods
  12.2 Advanced GroupBy Use
    Group Transforms and “Unwrapped” GroupBys
    Grouped Time Resampling
  12.3 Techniques for Method Chaining
    The pipe Method
  12.4 Conclusion
13. Introduction to Modeling Libraries in Python
  13.1 Interfacing Between pandas and Model Code
  13.2 Creating Model Descriptions with Patsy
    Data Transformations in Patsy Formulas
    Categorical Data and Patsy
  13.3 Introduction to statsmodels
    Estimating Linear Models
    Estimating Time Series Processes
  13.4 Introduction to scikit-learn
  13.5 Continuing Your Education
14. Data Analysis Examples
  14.1 1.USA.gov Data from Bitly
    Counting Time Zones in Pure Python
    Counting Time Zones with pandas
  14.2 MovieLens 1M Dataset
    Measuring Rating Disagreement
  14.3 US Baby Names 1880–2010
    Analyzing Naming Trends
  14.4 USDA Food Database
  14.5 2012 Federal Election Commission Database
    Donation Statistics by Occupation and Employer
    Bucketing Donation Amounts
    Donation Statistics by State
  14.6 Conclusion
A. Advanced NumPy
  A.1 ndarray Object Internals
    NumPy dtype Hierarchy
  A.2 Advanced Array Manipulation
    Reshaping Arrays
    C Versus Fortran Order
    Concatenating and Splitting Arrays
    Repeating Elements: tile and repeat
    Fancy Indexing Equivalents: take and put
  A.3 Broadcasting
    Broadcasting Over Other Axes
    Setting Array Values by Broadcasting
  A.4 Advanced ufunc Usage
    ufunc Instance Methods
    Writing New ufuncs in Python
  A.5 Structured and Record Arrays
    Nested dtypes and Multidimensional Fields
    Why Use Structured Arrays?
  A.6 More About Sorting
    Indirect Sorts: argsort and lexsort
    Alternative Sort Algorithms
    Partially Sorting Arrays
    numpy.searchsorted: Finding Elements in a Sorted Array
  A.7 Writing Fast NumPy Functions with Numba
    Creating Custom numpy.ufunc Objects with Numba
  A.8 Advanced Array Input and Output
    Memory-Mapped Files
    HDF5 and Other Array Storage Options
  A.9 Performance Tips
    The Importance of Contiguous Memory
B. More on the IPython System
  B.1 Using the Command History
    Searching and Reusing the Command History
    Input and Output Variables
  B.2 Interacting with the Operating System
    Shell Commands and Aliases
    Directory Bookmark System
  B.3 Software Development Tools
    Interactive Debugger
    Timing Code: %time and %timeit
    Basic Profiling: %prun and %run -p
    Profiling a Function Line by Line
  B.4 Tips for Productive Code Development Using IPython
    Reloading Module Dependencies
    Code Design Tips
  B.5 Advanced IPython Features
    Making Your Own Classes IPython-Friendly
    Profiles and Configuration
  B.6 Conclusion
Index
Preface
New for the Second Edition
The first edition of this book was published in 2012, during a time when open source
data analysis libraries for Python (such as pandas) were very new and developing
rapidly. In this updated and expanded second edition, I have overhauled the chapters to
account for incompatible changes and deprecations, as well as new features introduced
over the last five years. I’ve also added fresh content to introduce tools
that either did not exist in 2012 or had not matured enough to make the first cut.
Finally, I have tried to avoid writing about new or cutting-edge open source projects
that may not have had a chance to mature. I would like readers of this edition to find
that the content is still almost as relevant in 2020 or 2021 as it is in 2017.
The major updates in this second edition include:
• All code, including the Python tutorial, updated for Python 3.6 (the first edition
used Python 2.7)
• Updated Python installation instructions for the Anaconda Python Distribution
and other needed Python packages
• Updates for the latest versions of the pandas library in 2017
• A new chapter on some more advanced pandas tools, and some other usage tips
• A brief introduction to using statsmodels and scikit-learn
I also reorganized a significant portion of the content from the first edition to make
the book more accessible to newcomers.
Conventions Used in This Book
The following typographical conventions are used in this book:
Italic
Indicates new terms, URLs, email addresses, filenames, and file extensions.
Constant width
Used for program listings, as well as within paragraphs to refer to program
elements such as variable or function names, databases, data types, environment
variables, statements, and keywords.
Constant width bold
Shows commands or other text that should be typed literally by the user.
Constant width italic
Shows text that should be replaced with user-supplied values or by values
determined by context.
This element signifies a tip or suggestion.
This element signifies a general note.
This element indicates a warning or caution.
Using Code Examples
You can find data files and related material for each chapter in this book’s
GitHub repository at http://github.com/wesm/pydata-book.
This book is here to help you get your job done. In general, if example code is offered
with this book, you may use it in your programs and documentation. You do not
need to contact us for permission unless you’re reproducing a significant portion of
the code. For example, writing a program that uses several chunks of code from this
book does not require permission. Selling or distributing a CD-ROM of examples
from O’Reilly books does require permission. Answering a question by citing this
book and quoting example code does not require permission. Incorporating a significant
amount of example code from this book into your product’s documentation does
require permission.
We appreciate, but do not require, attribution. An attribution usually includes the
title, author, publisher, and ISBN. For example: “Python for Data Analysis by Wes
McKinney (O’Reilly). Copyright 2017 Wes McKinney, 978-1-491-95766-0.”
If you feel your use of code examples falls outside fair use or the permission given
above, feel free to contact us at [email protected].
CHAPTER 1
Preliminaries
1.1 What Is This Book About?
This book is concerned with the nuts and bolts of manipulating, processing, cleaning,
and crunching data in Python. My goal is to offer a guide to the parts of the Python
programming language and its data-oriented library ecosystem and tools that will
equip you to become an effective data analyst. While “data analysis” is in the title of
the book, the focus is specifically on Python programming, libraries, and tools as
opposed to data analysis methodology. This is the Python programming you need for
data analysis.
What Kinds of Data?
When I say “data,” what am I referring to exactly? The primary focus is on structured
data, a deliberately vague term that encompasses many different common forms of
data, such as:
• Tabular or spreadsheet-like data in which each column may be a different type
(string, numeric, date, or otherwise). This includes most kinds of data commonly
stored in relational databases or tab- or comma-delimited text files.
• Multidimensional arrays (matrices).
• Multiple tables of data interrelated by key columns (what would be primary or
foreign keys for a SQL user).
• Evenly or unevenly spaced time series.
This is by no means a complete list. Even though it may not always be obvious, a large
percentage of datasets can be transformed into a structured form that is more suitable
for analysis and modeling. If not, it may be possible to extract features from a dataset
into a structured form. As an example, a collection of news articles could be processed
into a word frequency table, which could then be used to perform sentiment
analysis.
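To make the news-article example concrete, here is a minimal sketch of turning unstructured text into a word frequency table using only the standard library. The three headlines, the tokenize helper, and the variable names are invented for illustration; a real pipeline would likely need more careful tokenization.

```python
import re
from collections import Counter

# Hypothetical mini-corpus standing in for a collection of news articles.
articles = [
    "Markets rally as tech stocks surge",
    "Tech stocks slip after the rally fades",
    "Local team wins the championship",
]

def tokenize(text):
    """Lowercase the text and split it into runs of ASCII letters."""
    return re.findall(r"[a-z]+", text.lower())

# One Counter per article, mapping each word to its count in that article.
per_article = [Counter(tokenize(article)) for article in articles]

# The columns of the table: every distinct word, sorted for a stable order.
vocabulary = sorted(set().union(*per_article))

# The structured form: one row per article, one column per word.
# Looking up a word a Counter has not seen yields 0.
table = [[counts[word] for word in vocabulary] for counts in per_article]

# Corpus-wide frequencies, obtained by adding the per-article Counters.
total = sum(per_article, Counter())
print(total.most_common(3))
```

Each row of table is now a fixed-length numeric record, which is the kind of structured data the rest of the book works with.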
Most users of spreadsheet programs like Microsoft Excel, perhaps the most widely
used data analysis tool in the world, will not be strangers to these kinds of data.
1.2 Why Python for Data Analysis?
For many people, the Python programming language has strong appeal. Since its first
appearance in 1991, Python has become one of the most popular interpreted programming
languages, along with Perl, Ruby, and others. Python and Ruby have
become especially popular since 2005 or so for building websites using their numerous
web frameworks, like Rails (Ruby) and Django (Python). Such languages are
often called scripting languages, as they can be used to quickly write small programs,
or scripts, to automate other tasks. I don’t like the term “scripting language,” as it carries
a connotation that they cannot be used for building serious software. Among
interpreted languages, for various historical and cultural reasons, Python has developed
a large and active scientific computing and data analysis community. In the last
10 years, Python has gone from a bleeding-edge or “at your own risk” scientific computing
language to one of the most important languages for data science, machine
learning, and general software development in academia and industry.
For data analysis and interactive computing and data visualization, Python will inevitably
draw comparisons with other open source and commercial programming languages
and tools in wide use, such as R, MATLAB, SAS, Stata, and others. In recent
years, Python’s improved support for libraries (such as pandas and scikit-learn) has
made it a popular choice for data analysis tasks. Combined with Python’s overall
strength for general-purpose software engineering, it is an excellent option as a primary
language for building data applications.
Python as Glue
Part of Python’s success in scientific computing is the ease of integrating C, C++, and
FORTRAN code. Most modern computing environments share a similar set of legacy
FORTRAN and C libraries for doing linear algebra, optimization, integration, fast
Fourier transforms, and other such algorithms. The same story has held true for
many companies and national labs that have used Python to glue together decades’
worth of legacy software.
Many programs consist of small portions of code where most of the time is spent,
with large amounts of “glue code” that doesn’t run often. In many cases, the execution
time of the glue code is insignificant; effort is most fruitfully invested in optimizing
the computational bottlenecks, sometimes by moving the code to a lower-level language
like C.
Solving the “Two-Language” Problem
In many organizations, it is common to research, prototype, and test new ideas using
a more specialized computing language like SAS or R and then later port those ideas
to be part of a larger production system written in, say, Java, C#, or C++. What people
are increasingly finding is that Python is a suitable language not only for doing
research and prototyping but also for building the production systems. Why maintain
two development environments when one will suffice? I believe that more and
more companies will go down this path, as there are often significant organizational
benefits to having both researchers and software engineers using the same set of
programming tools.
Why Not Python?
While Python is an excellent environment for building many kinds of analytical
applications and general-purpose systems, there are a number of uses for which
Python may be less suitable.
As Python is an interpreted programming language, in general most Python code will
run substantially slower than code written in a compiled language like Java or C++.
As programmer time is often more valuable than CPU time, many are happy to make
this trade-off. However, in an application with very low latency or demanding
resource utilization requirements (e.g., a high-frequency trading system), the time
spent programming in a lower-level (but also lower-productivity) language like C++
to achieve the maximum possible performance might be time well spent.
Python can be a challenging language for building highly concurrent, multithreaded
applications, particularly applications with many CPU-bound threads. The reason for
this is that it has what is known as the global interpreter lock (GIL), a mechanism that
prevents the interpreter from executing more than one Python instruction at a time.
The technical reasons for why the GIL exists are beyond the scope of this book. While
it is true that in many big data processing applications, a cluster of computers may be
required to process a dataset in a reasonable amount of time, there are still situations
where a single-process, multithreaded system is desirable.
This is not to say that Python cannot execute truly multithreaded, parallel code.
Python C extensions that use native multithreading (in C or C++) can run code in
parallel without being impacted by the GIL, so long as they do not need to regularly
interact with Python objects.
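One rough way to observe the effect described above is to time the same CPU-bound, pure-Python work run sequentially and then in a thread pool. This is only a sketch: count_primes, the task sizes, and the worker count are arbitrary choices for illustration, and the timings depend on the machine and interpreter build, so no particular numbers are claimed here.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def count_primes(limit):
    """CPU-bound pure-Python work: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

# Four identical, independent tasks (the size is an arbitrary assumption).
limits = [20_000] * 4

# Run the tasks one after another.
start = time.perf_counter()
sequential = [count_primes(limit) for limit in limits]
sequential_seconds = time.perf_counter() - start

# Run the same tasks in four threads of a single process.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(count_primes, limits))
threaded_seconds = time.perf_counter() - start

# Both approaches compute the same answers; only the elapsed time may differ.
print(f"sequential: {sequential_seconds:.3f}s, threaded: {threaded_seconds:.3f}s")
```

On an interpreter with a GIL, the threaded run would not be expected to finish much faster than the sequential one, because only one thread executes Python bytecode at a time.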
1.3 Essential Python Libraries
For those who are less familiar with the Python data ecosystem and the libraries used
throughout the book, I will give a brief overview of some of them.
NumPy
NumPy, short for Numerical Python, has long been a cornerstone of numerical computing
in Python. It provides the data structures, algorithms, and library glue needed
for most scientific applications involving numerical data in Python. NumPy contains,
among other things:
• A fast and efficient multidimensional array object ndarray
• Functions for performing element-wise computations with arrays or mathematical
operations between arrays
• Tools for reading and writing array-based datasets to disk
• Linear algebra operations, Fourier transform, and random number generation
• A mature C API to enable Python extensions and native C or C++ code to access
NumPy’s data structures and computational facilities
Beyond the fast array-processing capabilities that NumPy adds to Python, one of its
primary uses in data analysis is as a container for data to be passed between algorithms
and libraries. For numerical data, NumPy arrays are more efficient for storing
and manipulating data than the other built-in Python data structures. Also, libraries
written in a lower-level language, such as C or Fortran, can operate on the data stored
in a NumPy array without copying data into some other memory representation.
Thus, many numerical computing tools for Python either assume NumPy arrays as a
primary data structure or else target seamless interoperability with NumPy.
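As a brief illustration of several items in the list above, the following sketch creates an ndarray, applies element-wise arithmetic, solves a small linear system, and draws pseudorandom numbers. The array values and variable names are made up for the example.

```python
import numpy as np

# A two-dimensional ndarray; every element shares one data type.
prices = np.array([[10.0, 20.0, 30.0],
                   [1.5, 2.5, 3.5]])
print(prices.shape, prices.dtype)

# Element-wise computation: no explicit Python loop is needed.
discounted = prices * 0.9
total = prices + discounted

# A reduction along an axis: the mean of each row.
row_means = prices.mean(axis=1)

# Linear algebra: solve the system a @ x = b.
a = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(a, b)

# Pseudorandom number generation with a fixed seed for repeatability.
rng = np.random.default_rng(seed=0)
samples = rng.standard_normal(1000)

# Passing an existing array through np.asarray does not copy its data.
view = np.asarray(prices)
print(np.shares_memory(view, prices))
```

The last two lines hint at the container role: code that accepts an ndarray can work on the same memory rather than on a converted copy.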
pandas
pandas provides high-level data structures and functions designed to make working
with structured or tabular data fast, easy, and expressive. Since its emergence in 2010,
it has helped enable Python to be a powerful and productive data analysis environment.
The primary objects in pandas that will be used in this book are the DataFrame,
a tabular, column-oriented data structure with both row and column labels, and the
Series, a one-dimensional labeled array object.
pandas blends the high-performance, array-computing ideas of NumPy with the flexible
data manipulation capabilities of spreadsheets and relational databases (such as
SQL databases). It provides sophisticated indexing functionality to make it easy to
reshape, slice and dice, perform aggregations, and select subsets of data. Since data
manipulation, preparation, and cleaning are such important skills in data analysis,
pandas is one of the primary focuses of this book.
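The sketch below shows the two primary objects and a few of the operations just mentioned: selecting subsets, aggregating, and reshaping. The cities, years, and sales figures are invented for illustration.

```python
import pandas as pd

# A DataFrame: tabular, column-oriented, with row and column labels.
df = pd.DataFrame(
    {
        "city": ["Austin", "Boston", "Austin", "Boston"],
        "year": [2016, 2016, 2017, 2017],
        "sales": [100.0, 80.0, 120.0, 90.0],
    },
    index=["a", "b", "c", "d"],
)

# Selecting a single column yields a Series, a one-dimensional labeled array.
sales = df["sales"]

# Select a subset of rows with a boolean condition.
austin = df[df["city"] == "Austin"]

# Select rows and columns by label.
recent = df.loc[["c", "d"], ["city", "sales"]]

# Aggregate: total sales per city.
by_city = df.groupby("city")["sales"].sum()

# Reshape: one row per year, one column per city.
wide = df.pivot(index="year", columns="city", values="sales")
print(wide)
```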
As a bit of background, I started building pandas in early 2008 during my tenure at
AQR Capital Management, a quantitative investment management firm. At the time,
I had a distinct set of requirements that were not well addressed by any single tool at
my disposal:
• Data structures with labeled axes supporting automatic or explicit data alignment
—this prevents common errors resulting from misaligned data and working with
differently indexed data coming from different sources
• Integrated time series functionality
• The same data structures handle both time series data and non–time series data
• Arithmetic operations and reductions that preserve metadata
• Flexible handling of missing data
• Merge and other relational operations found in popular databases (SQL-based,
for example)
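A few of the requirements in this list can be sketched in a handful of lines. The example below uses made-up values to show automatic alignment on labels, one way of handling the resulting missing data, a database-style merge, and an arithmetic operation that keeps its time index.

```python
import pandas as pd

# Two Series whose indexes only partly overlap.
s1 = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])
s2 = pd.Series([10.0, 20.0, 30.0], index=["b", "c", "d"])

# Arithmetic aligns on labels; labels found in only one operand give NaN.
aligned_sum = s1 + s2

# Flexible missing-data handling: treat an absent value as 0 instead.
filled_sum = s1.add(s2, fill_value=0)

# A relational (SQL-style) inner join on a key column.
left = pd.DataFrame({"key": ["x", "y", "z"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["x", "y", "w"], "rval": [4, 5, 6]})
merged = pd.merge(left, right, on="key", how="inner")

# The same Series type handles time series; arithmetic preserves the index.
dates = pd.date_range("2017-01-01", periods=4, freq="D")
ts = pd.Series([1.0, 2.0, 3.0, 4.0], index=dates)
doubled = ts * 2
print(doubled.index.equals(dates))
```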
I wanted to be able to do all of these things in one place, preferably in a language well
suited to general-purpose software development. Python was a good candidate language
for this, but at that time there was not an integrated set of data structures and
tools providing this functionality. As a result of having been built initially to solve
finance and business analytics problems, pandas features especially deep time series
functionality and tools well suited for working with time-indexed data generated by
business processes.
For users of the R language for statistical computing, the DataFrame name will be
familiar, as the object was named after the similar R data.frame object. Unlike in
Python, data frames are built into the R programming language and its standard
library. As a result, many features found in pandas are typically either part of the R
core implementation or provided by add-on packages.
The pandas name itself is derived from panel data, an econometrics term for
multidimensional structured datasets, and a play on the phrase Python data analysis itself.
matplotlib
matplotlib is the most popular Python library for producing plots and other
two-dimensional data visualizations. It was originally created by John D. Hunter and is
now maintained by a large team of developers. It is designed for creating plots suitable
for publication. While there are other visualization libraries available to Python
programmers, matplotlib is the most widely used and as such has generally good
integration with the rest of the ecosystem. I think it is a safe choice as a default
visualization tool.
IPython and Jupyter
The IPython project began in 2001 as Fernando Pérez’s side project to make a better
interactive Python interpreter. In the subsequent 16 years it has become one of the
most important tools in the modern Python data stack. While it does not provide any
computational or data analytical tools by itself, IPython is designed from the ground
up to maximize your productivity in both interactive computing and software development.
It encourages an execute-explore workflow instead of the typical edit-compile-run
workflow of many other programming languages. It also provides easy access to
your operating system’s shell and filesystem. Since much of data analysis coding
involves exploration, trial and error, and iteration, IPython can help you get the job
done faster.
In 2014, Fernando and the IPython team announced the Jupyter project, a broader
initiative to design language-agnostic interactive computing tools. The IPython web
notebook became the Jupyter notebook, with support now for over 40 programming
languages. The IPython system can now be used as a kernel (a programming language
mode) for using Python with Jupyter.
IPython itself has become a component of the much broader Jupyter open source
project, which provides a productive environment for interactive and exploratory
computing. Its oldest and simplest “mode” is as an enhanced Python shell designed to
accelerate the writing, testing, and debugging of Python code. You can also use the
IPython system through the Jupyter Notebook, an interactive web-based code “notebook”
offering support for dozens of programming languages. The IPython shell and
Jupyter notebooks are especially useful for data exploration and visualization.
The Jupyter notebook system also allows you to author content in Markdown and
HTML, providing you a means to create rich documents with code and text. Other
programming languages have also implemented kernels for Jupyter to enable you to
use languages other than Python in Jupyter.
For me personally, IPython is usually involved with the majority of my Python work,
including running, debugging, and testing code.
In the accompanying book materials, you will find Jupyter notebooks containing all
the code examples from each chapter.
SciPy
SciPy is a collection of packages addressing a number of different standard problem
domains in scientific computing. Here is a sampling of the packages included:
scipy.integrate
Numerical integration routines and differential equation solvers
scipy.linalg
Linear algebra routines and matrix decompositions extending beyond those
provided in numpy.linalg
scipy.optimize
Function optimizers (minimizers) and root finding algorithms
scipy.signal
Signal processing tools
scipy.sparse
Sparse matrices and sparse linear system solvers
scipy.special
Wrapper around SPECFUN, a Fortran library implementing many common
mathematical functions, such as the gamma function
scipy.stats
Standard continuous and discrete probability distributions (density functions,
samplers, cumulative distribution functions), various statistical tests, and more
descriptive statistics
Together NumPy and SciPy form a reasonably complete and mature computational
foundation for many traditional scientific computing applications.
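As a minimal sketch of how a few of these subpackages are called, assuming SciPy and NumPy are installed: the toy problems below (the integrand, the 2×2 system, and the root-finding target) are chosen only for illustration and do not come from the book.

```python
import numpy as np
from scipy import integrate, linalg, optimize, stats

# scipy.integrate: numerically integrate x**2 over [0, 1] (exact value is 1/3)
area, abs_err = integrate.quad(lambda x: x ** 2, 0.0, 1.0)

# scipy.linalg: solve the linear system A @ x = b
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)

# scipy.optimize: find the root of t**2 - 2 on [0, 2] (exact value is sqrt(2))
root = optimize.brentq(lambda t: t ** 2 - 2.0, 0.0, 2.0)

# scipy.stats: cumulative distribution function of the standard normal at 0
p = stats.norm.cdf(0.0)

print(area, x, root, p)
```

Each call returns plain Python floats or NumPy arrays, which is why SciPy results combine naturally with the NumPy and pandas objects used in the rest of the book.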
scikit-learn
Since the project’s inception in 2010, scikit-learn has become the premier general-
purpose machine learning toolkit for Python programmers. In just seven years, it has
had over 1,500 contributors from around the world. It includes submodules for such
models as:
• Classification: SVM, nearest neighbors, random forest, logistic regression, etc.
• Regression: Lasso, ridge regression, etc.
• Clustering: k-means, spectral clustering, etc.
• Dimensionality reduction: PCA, feature selection, matrix factorization, etc.
• Model selection: Grid search, cross-validation, metrics
• Preprocessing: Feature extraction, normalization
Along with pandas, statsmodels, and IPython, scikit-learn has been critical for enabling
Python to be a productive data science programming language. While I won’t
be able to include a comprehensive guide to scikit-learn in this book, I will give a
brief introduction to some of its models and how to use them with the other tools
presented in the book.
statsmodels
statsmodels is a statistical analysis package that was seeded by work from Stanford
University statistics professor Jonathan Taylor, who implemented a number of regression
analysis models popular in the R programming language. Skipper Seabold and
Josef Perktold formally created the new statsmodels project in 2010 and since then
have grown the project to a critical mass of engaged users and contributors. Nathaniel
Smith developed the Patsy project, which provides a formula or model specification
framework for statsmodels inspired by R’s formula system.
Compared with scikit-learn, statsmodels contains algorithms for classical (primarily
frequentist) statistics and econometrics. This includes such submodules as:
• Regression models: Linear regression, generalized linear models, robust linear
models, linear mixed effects models, etc.
• Analysis of variance (ANOVA)
• Time series analysis: AR, ARMA, ARIMA, VAR, and other models
• Nonparametric methods: Kernel density estimation, kernel regression
• Visualization of statistical model results
statsmodels is more focused on statistical inference, providing uncertainty estimates
and p-values for parameters. scikit-learn, by contrast, is more prediction-focused.
As with scikit-learn, I will give a brief introduction to statsmodels and how to use it
with NumPy and pandas.
1.4 Installation and Setup
Since everyone uses Python for different applications, there is no single solution for
setting up Python and required add-on packages. Many readers will not have a complete
Python development environment suitable for following along with this book,
so here I will give detailed instructions to get set up on each operating system. I recommend
using the free Anaconda distribution. At the time of this writing, Anaconda
is offered in both Python 2.7 and 3.6 forms, though this might change at some point
in the future. This book uses Python 3.6, and I encourage you to use Python 3.6 or
higher.
Windows
To get started on Windows, download the Anaconda installer. I recommend following
the installation instructions for Windows available on the Anaconda download
page, which may have changed between the time this book was published and when
you are reading this.
Now, let’s verify that things are configured correctly. To open the Command Prompt
application (also known as cmd.exe), right-click the Start menu and select Command
Prompt. Try starting the Python interpreter by typing python. You should see a message
that matches the version of Anaconda you installed:
C:\Users\wesm>python
Python 3.5.2 |Anaconda 4.1.1 (64-bit)| (default, Jul 5 2016, 11:41:13)
[MSC v.1900 64 bit (AMD64)] on win32
>>>
To exit the shell, press Ctrl-D (on Linux or macOS), Ctrl-Z followed by Enter (on Windows), or type
the command exit() and press Enter.
Apple (OS X, macOS)
Download the OS X Anaconda installer, which should be named something like
Anaconda3-4.1.0-MacOSX-x86_64.pkg. Double-click the .pkg file to run the installer.
When the installer runs, it automatically appends the Anaconda executable path to
your .bash_profile file. This is located at /Users/$USER/.bash_profile.
To verify everything is working, try launching IPython in the system shell (open the
Terminal application to get a command prompt):
$ ipython
To exit the shell, press Ctrl-D or type exit() and press Enter.
GNU/Linux
Linux details will vary a bit depending on your Linux flavor, but here I give details for
such distributions as Debian, Ubuntu, CentOS, and Fedora. Setup is similar to OS X
with the exception of how Anaconda is installed. The installer is a shell script that
must be executed in the terminal. Depending on whether you have a 32-bit or 64-bit
system, you will either need to install the x86 (32-bit) or x86_64 (64-bit) installer. You
will then have a file named something similar to Anaconda3-4.1.0-Linux-x86_64.sh.
To install it, execute this script with bash:
$ bash Anaconda3-4.1.0-Linux-x86_64.sh
Some Linux distributions have versions of all the required Python
packages in their package managers, which can be installed using a
tool like apt. The setup described here uses Anaconda, as it is both
easily reproducible across distributions and makes it simpler to upgrade
packages to their latest versions.
After accepting the license, you will be presented with a choice of where to put the
Anaconda files. I recommend installing the files in the default location in your home
directory—for example, /home/$USER/anaconda (with your username, naturally).
The Anaconda installer may ask if you wish to prepend its bin/ directory to your
$PATH variable. If you have any problems after installation, you can do this yourself by
modifying your .bashrc (or .zshrc, if you are using the zsh shell) with something akin
to:
export PATH=/home/$USER/anaconda/bin:$PATH
After doing this you can either start a new terminal process or execute your .bashrc
again with source ~/.bashrc.
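After editing PATH, it can help to confirm which executables the shell will actually find. The following is a small sketch using only the standard library; the helper names first_on_path and path_entries are made up for this illustration.

```python
import os
import shutil
import sys


def first_on_path(command="python"):
    """Return the full path of the first `command` found on PATH, or None."""
    return shutil.which(command)


def path_entries():
    """Return the directories listed in PATH, in lookup order."""
    return [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]


print("Running interpreter:", sys.executable)
print("First 'python' on PATH:", first_on_path("python"))
print("First 'conda' on PATH:", first_on_path("conda"))
for entry in path_entries()[:5]:
    print("  ", entry)
```

If the Anaconda bin/ directory was prepended correctly, it should appear near the top of the printed entries, and the first python found should live inside it.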
Installing or Updating Python Packages
At some point while reading, you may wish to install additional Python packages that
are not included in the Anaconda distribution. In general, these can be installed with
the following command:
conda install package_name
If this does not work, you may also be able to install the package using the pip package
management tool:
pip install package_name
You can update packages by using the conda update command:
conda update package_name
pip also supports upgrades using the --upgrade flag:
pip install --upgrade package_name
You will have several opportunities to try out these commands throughout the book.
While you can use both conda and pip to install packages, you
should not attempt to update conda packages with pip, as doing so
can lead to environment problems. When using Anaconda or Miniconda,
it’s best to first try updating with conda.
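Before installing or updating, it is often useful to check what is already present. Here is a hedged sketch using importlib.metadata from the standard library. Note that this module requires Python 3.8 or newer, which is later than the Python 3.6 baseline of the book, and that installed_version is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string for a package, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


for name in ("numpy", "pandas", "matplotlib"):
    version = installed_version(name)
    print(name, version if version is not None else "not installed")
```

The lookup reads package metadata rather than importing the package, so it reports the same version regardless of whether conda or pip performed the installation.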
Python 2 and Python 3
The first version of the Python 3.x line of interpreters was released at the end of 2008.
It included a number of changes that made some previously written Python 2.x code
incompatible. Because 17 years had passed since the very first release of Python in
1991, creating a “breaking” release of Python 3 was viewed to be for the greater good
given the lessons learned during that time.
In 2012, much of the scientific and data analysis community was still using Python
2.x because many packages had not been made fully Python 3 compatible. Thus, the
first edition of this book used Python 2.7. Now, users are free to choose between
Python 2.x and 3.x and in general have full library support with either flavor.
However, Python 2.x will reach its development end of life in 2020 (including critical
security patches), and so it is no longer a good idea to start new projects in Python
2.7. Therefore, this book uses Python 3.6, a widely deployed, well-supported stable
release. We have begun to call Python 2.x “Legacy Python” and Python 3.x simply
“Python.” I encourage you to do the same.
This book uses Python 3.6 as its basis. Your version of Python may be newer than 3.6,
but the code examples should be forward compatible. Some code examples may work
differently or not at all in Python 2.7.
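To make the version requirement concrete, the sketch below checks the interpreter version and shows three well-known behaviors that differ between Python 2.7 and Python 3. The examples are illustrative choices, not a list taken from the book.

```python
import sys

# Stop early with a clear message on interpreters older than the book's baseline.
if sys.version_info < (3, 6):
    raise RuntimeError("Python 3.6 or newer is assumed, found " + sys.version.split()[0])

# Division: `/` is true division in Python 3 (Python 2 gave 3 for 7 / 2).
true_division = 7 / 2
floor_division = 7 // 2  # floor division behaves the same in both

# Text: str holds Unicode text, and bytes is a separate type.
text = "caf\u00e9"
encoded = text.encode("utf-8")

# Printing: print is a function in Python 3, a statement in Python 2.
print(true_division, floor_division, len(text), len(encoded))
```

On Python 3 this prints 3.5 3 4 5: the string has four characters, but its UTF-8 encoding takes five bytes.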
Integrated Development Environments (IDEs) and Text Editors
When asked about my standard development environment, I almost always say “IPython
plus a text editor.” I typically write a program and iteratively test and debug each
piece of it in IPython or Jupyter notebooks. It is also useful to be able to play around
with data interactively and visually verify that a particular set of data manipulations is
doing the right thing. Libraries like pandas and NumPy are designed to be easy to use
in the shell.
When building software, however, some users may prefer to use a more richly featured
IDE rather than a comparatively primitive text editor like Emacs or Vim. Here
are some that you can explore:
• PyDev (free), an IDE built on the Eclipse platform
• PyCharm from JetBrains (subscription-based for commercial users, free for open
source developers)
• Python Tools for Visual Studio (for Windows users)
• Spyder (free), an IDE currently shipped with Anaconda
• Komodo IDE (commercial)
Due to the popularity of Python, most text editors, like Atom and Sublime Text 2,
have excellent Python support.
1.5 Community and Conferences
Outside of an internet search, the various scientific and data-related Python mailing
lists are generally helpful and responsive to questions. Some to take a look at include:
• pydata: A Google Group list for questions related to Python for data analysis and
pandas
• pystatsmodels: For statsmodels or pandas-related questions
• Mailing list for scikit-learn ([email protected]) and machine learning in
Python, generally
• numpy-discussion: For NumPy-related questions
• scipy-user: For general SciPy or scientific Python questions
I deliberately did not post URLs for these in case they change. They can be easily
located via an internet search.
Each year many conferences are held all over the world for Python programmers. If
you would like to connect with other Python programmers who share your interests,
I encourage you to explore attending one, if possible. Many conferences have financial
support available for those who cannot afford admission or travel to the conference.
Here are some to consider:
• PyCon and EuroPython: The two main general Python conferences in North
America and Europe, respectively
• SciPy and EuroSciPy: Scientific-computing-oriented conferences in North America
and Europe, respectively
• PyData: A worldwide series of regional conferences targeted at data science and
data analysis use cases
• International and regional PyCon conferences (see http://pycon.org for a complete
listing)
1.6 Navigating This Book
If you have never programmed in Python before, you will want to spend some time in
Chapters 2 and 3, where I have placed a condensed tutorial on Python language features
and the IPython shell and Jupyter notebooks. These things are prerequisite
knowledge for the remainder of the book. If you have Python experience already, you
may instead choose to skim or skip these chapters.
Next, I give a short introduction to the key features of NumPy, leaving more
advanced NumPy use for Appendix A. Then, I introduce pandas and devote the rest
of the book to data analysis topics applying pandas, NumPy, and matplotlib (for visualization).
I have structured the material in the most incremental way possible,
though there is occasionally some minor cross-over between chapters, with a few isolated
cases where concepts are used that haven’t necessarily been introduced yet.
While readers may have many different end goals for their work, the tasks required
generally fall into a number of different broad groups:
Interacting with the outside world
Reading and writing with a variety of file formats and data stores
Preparation
Cleaning, munging, combining, normalizing, reshaping, slicing and dicing, and
transforming data for analysis
Transformation
Applying mathematical and statistical operations to groups of datasets to derive
new datasets (e.g., aggregating a large table by group variables)
Modeling and computation
Connecting your data to statistical models, machine learning algorithms, or other
computational tools
Presentation
Creating interactive or static graphical visualizations or textual summaries
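The task groups above can be sketched end to end with nothing but the standard library. The dataset is invented for illustration, the modeling step is omitted, and the rest of the book performs these steps with pandas rather than with the csv module.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Interacting with the outside world: parse CSV text (a stand-in for a file).
raw = io.StringIO(
    "city,temp_c\n"
    "Oslo, 4.0\n"
    "Lima,19.5\n"
    "Oslo,\n"
    "Lima,20.5\n"
    "Oslo,6.0\n"
)
rows = list(csv.DictReader(raw))

# Preparation: strip whitespace and drop rows with a missing measurement.
clean = [
    {"city": r["city"].strip(), "temp_c": float(r["temp_c"])}
    for r in rows
    if r["temp_c"].strip()
]

# Transformation: aggregate by a group variable.
by_city = defaultdict(list)
for r in clean:
    by_city[r["city"]].append(r["temp_c"])
mean_temp = {city: mean(values) for city, values in by_city.items()}

# Presentation: a textual summary.
for city in sorted(mean_temp):
    print(f"{city}: {mean_temp[city]:.1f} C")
```

Even at this tiny scale, most of the code goes into preparation and transformation, which is the part of the workflow the book concentrates on.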
Code Examples
Most of the code examples in the book are shown with input and output as it would
appear executed in the IPython shell or in Jupyter notebooks:
In [5]: CODE EXAMPLE
Out[5]: OUTPUT
When you see a code example like this, the intent is for you to type in the example
code in the In block in your coding environment and execute it by pressing the Enter
key (or Shift-Enter in Jupyter). You should see output similar to what is shown in the
Out block.
Data for Examples
Datasets for the examples in each chapter are hosted in a GitHub repository. You can
download this data either by using the Git version control system on the command
line or by downloading a zip file of the repository from the website. If you run into
problems, navigate to my website for up-to-date instructions about obtaining the
book materials.
I have made every effort to ensure that the GitHub repository contains everything necessary to reproduce
the examples, but I may have made some mistakes or omissions. If so, please send me
an email: [email protected]. The best way to report errors in the book is on the
errata page on the O’Reilly website.
Import Conventions
The Python community has adopted a number of naming conventions for commonly
used modules:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import statsmodels.api as sm
This means that when you see np.arange, this is a reference to the arange function in
NumPy. This is done because it’s considered bad practice in Python software development
to import everything (from numpy import *) from a large package like NumPy.
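One reason star imports are discouraged is that they can silently replace names you already rely on. The sketch below demonstrates this with the standard library math module as a stand-in for NumPy, so that it runs without any third-party packages. The scratch namespace is an illustration device, not something you would write in real code.

```python
import math as m  # alias import: every name stays behind the `m.` prefix

# Run the star import in a scratch namespace so its effect can be inspected
# without polluting the namespace of this snippet.
scratch = {}
exec("from math import *", scratch)

# The star import rebinds `pow`, shadowing the built-in function of that name.
shadowed = scratch["pow"] is m.pow

builtin_result = pow(2, 3)          # the built-in keeps integers as integers
star_result = scratch["pow"](2, 3)  # math.pow always returns a float

print(shadowed, type(builtin_result).__name__, type(star_result).__name__)
```

With an alias such as np or m, a reader can always tell which package a function comes from, and built-in names keep their usual meaning.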
Jargon
I’ll use some terms common both to programming and data science that you may not
be familiar with. Thus, here are some brief definitions:
Munge/munging/wrangling
Describes the overall process of manipulating unstructured and/or messy data
into a structured or clean form. The word has snuck its way into the jargon of
many modern-day data hackers. “Munge” rhymes with “grunge.”
Pseudocode
A description of an algorithm or process that takes a code-like form while likely
not being actual valid source code.
Syntactic sugar
Programming syntax that does not add new features, but makes something more
convenient or easier to type.
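As a small illustration of syntactic sugar, the sketch below writes the same computation twice, once with an explicit loop and once with a list comprehension. The example values are arbitrary.

```python
values = [1, 2, 3, 4]

# Without sugar: an explicit loop that builds the list step by step.
squares_loop = []
for v in values:
    squares_loop.append(v * v)

# With sugar: a list comprehension expressing the same computation.
squares_comp = [v * v for v in values]

# Likewise, for integers `total += v` is shorthand for `total = total + v`.
total = 0
for v in values:
    total += v

print(squares_loop == squares_comp, total)
```

Both forms produce the same list; the comprehension adds no new capability, it is only shorter to type and read.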
CHAPTER 2
Python Language Basics, IPython, and
Jupyter Notebooks
When I wrote the first edition of this book in 2011 and 2012, there were fewer resources
available for learning about doing data analysis in Python. This was partially a
chicken-and-egg problem; many libraries that we now take for granted, like pandas,
scikit-learn, and statsmodels, were comparatively immature back then. In 2017, there
is now a growing literature on data science, data analysis, and machine learning, supplementing
the prior works on general-purpose scientific computing geared toward
computational scientists, physicists, and professionals in other research fields. There
are also excellent books about learning the Python programming language itself and
becoming an effective software engineer.
As this book is intended as an introductory text in working with data in Python, I feel
it is valuable to have a self-contained overview of some of the most important features
of Python’s built-in data structures and libraries from the perspective of data
manipulation. So, I will only present roughly enough information in this chapter and
Chapter 3 to enable you to follow along with the rest of the book.
In my opinion, it is not necessary to become proficient at building good software in
Python to be able to productively do data analysis. I encourage you to use the IPython
shell and Jupyter notebooks to experiment with the code examples and to
explore the documentation for the various types, functions, and methods. While I’ve
made best efforts to present the book material in an incremental form, you may occasionally
encounter things that have not yet been fully introduced.
Much of this book focuses on table-based analytics and data preparation tools for
working with large datasets. In order to use those tools you must often first do some
munging to corral messy data into a more nicely tabular (or structured) form. Fortunately,
Python is an ideal language for rapidly whipping your data into shape. The
greater your facility with Python the language, the easier it will be for you to prepare
new datasets for analysis.
Some of the tools in this book are best explored from a live IPython or Jupyter session.
Once you learn how to start up IPython and Jupyter, I recommend that you follow
along with the examples so you can experiment and try different things. As with
any keyboard-driven console-like environment, developing muscle-memory for the
common commands is also part of the learning curve.
There are introductory Python concepts that this chapter does not
cover, like classes and object-oriented programming, which you
may find useful in your foray into data analysis in Python.
To deepen your Python language knowledge, I recommend that
you supplement this chapter with the official Python tutorial and
potentially one of the many excellent books on general-purpose
Python programming. Some recommendations to get you started
include:
• Python Cookbook, Third Edition, by David Beazley and Brian
K. Jones (O’Reilly)
• Fluent Python by Luciano Ramalho (O’Reilly)
• Effective Python by Brett Slatkin (Pearson)
2.1 The Python Interpreter
Python is an interpreted language. The Python interpreter runs a program by executing
one statement at a time. The standard interactive Python interpreter can be
invoked on the command line with the python command:
$ python
Python 3.6.0 | packaged by conda-forge | (default, Jan 13 2017, 23:17:12)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-15)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 5
>>> print(a)
5
The >>> you see is the prompt where you’ll type code expressions. To exit the Python
interpreter and return to the command prompt, you can either type exit() or press
Ctrl-D.
Running Python programs is as simple as calling python with a .py file as its first
argument. Suppose we had created hello_world.py with these contents:
print('Hello world')
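One way to see such a script run without leaving Python is sketched below. It writes hello_world.py into a temporary directory and launches it with the current interpreter, which is roughly what typing python hello_world.py at the command prompt does. The temporary directory and the use of subprocess are assumptions made for this illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "hello_world.py"
    script.write_text("print('Hello world')\n")

    # Equivalent to typing `python hello_world.py` in that directory,
    # but pinned to the interpreter that is running this snippet.
    result = subprocess.run(
        [sys.executable, str(script)],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )

output = result.stdout
print(output, end="")
```

The script's path is passed as the first argument after the interpreter, matching the description above.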
counted one, two, three, four, and up to ten—they are no count, and
the whole crowd made a rush at them, and Lieutenant Chalfant was
knocked down, and the momentum of the crowd carried the crowd
out of sight. They had thrown stones at the heads of them, and
broken the windows.
Q. You didn't make any effort to get any greater number of
policemen to send there?
A. We had to … half a dozen places at the same time. We just done
the best we could, and possibly might have done better, if there had
not been so many strategists coming there to bother us.
Q. Did you send any policemen to protect the fire companies?
A. Why, yes, sir.
Q. Whom did you send?
A. I was there myself, with fifteen policemen.
Q. Whom did you offer assistance to?
A. Let me tell you.
Q. Just answer the question?
A. We can get to that better.
Q. Whom did you offer assistance to?
A. To the man in charge.
Q. Who was he?
A. I don't know what his name was.
Q. What street was it?
A. It was, as I think, at the corner of Twentieth and Liberty. You
can't understand this, unless you let me tell the story.
Q. At what time?
A. I can't give you any hour. I know nothing of time.
Q. You offered assistance to the man in charge. What was he doing?
A. He was throwing water on French's spring works. You better let
me tell the story. You are cutting it up.
Q. What did he say?
A. He says to me, says he, "I won't do it—I am not going to risk my
life—if you want to take charge of this thing you can do it."
Q. He was throwing water at that time without any molestation from
the mob?
A. Certainly; and the police was stationed across the street to
protect them. Whether they would have stood fire or not, I can't tell.
Q. What assistance did you offer him?
A. The police that were there within thirty feet of me.
Q. If he was not molested by the mob at that time, he wanted no
further assistance?
A. You won't let me tell this story straight. If you let me commence
at the beginning you will understand it.
Q. Did you offer assistance at any other time than the one you speak
of now?
A. I told you that I offered assistance on Saturday night, and it was
refused.
Q. To whom did you offer the assistance on Saturday night?
A. I sent Officer Coulson to the fire department to tell them to come
and aid the police.
Q. We have had Officer Coulson and his story?
A. On Sunday morning, when the fire had crossed Liberty street, I
went to hunt the chief of the fire department, and could not find
him, to concert measures with him. That is the time I talked about
the water arrangement. Then a man connected with the
Pennsylvania railroad came to me, and says he, "If I get an engine
at the corner of Twentieth street to throw water on the railroad cars
will you have the police force there to protect me?" Says I, "I will." I
immediately went and I gathered about fifteen policemen, as nigh as
I can guess, and had them at the corner of Twentieth street. I think
it is at the lower end of French's spring works. I had them there a
very long time, and no engine appeared. John Coyle, a member of
the bar here, came along and spoke to me, and I said to him, says I,
"John"—I told him the facts—"come along with me, I want to hunt
this thing up," and we went up to find the chief, and we didn't find
him. We found Commissioner Coates, the man that had a pistol at
his head and lived to tell the tale. He said he had an engine. I left
Mr. Coyle and came down. Coyle went about his business; and I saw
an engine coming down one of the cross-streets—Penn street—and I
went over to see where it was going, and it went away down town. I
went back to where I had the police stationed waiting for the engine
to come. After a very great delay, the engine came and attached to
a fire plug; but instead of throwing water upon the burning cars,
opposite to this street where we were, he commenced throwing
upon French's spring works. Then Mr. Houseman I think it is—the
gentleman who had made the request of me—I went to him and
said something to him, and he came back to me and said, "These
men won't do anything. You come and see what you can do." I went
over to him, and the answer he made was he was not going to risk
his life, but if I wanted to take charge of it I could do so. But I didn't
do so. Then the police—they were few in number, and not able to do
anything—I just told them to go and do what they could. Then I
went down town, and knew the result of the citizens' meeting.
By Mr. Larrabee:
Q. You said you did not agree, nor could not agree with the plan
adopted by the sheriff and the troops, or the officers of the troops,
in charge of matters, and at the same time the directions you gave
your police was to be careful, and not excite the crowd, and not
make these arrests. Are we to infer from that, that your plan was
that you must not oppose force to them, you must handle them
gingerly and tenderly. Is that what we must infer?
A. No, sir; every occasion presents its own line of action.
Q. The troops and the sheriff were trying to oppose the crowd by
force and stop the riot, and you say you did not agree with their
plan of action?
A. I don't. I think that the military force is only to be used in case of
the very last resort.
Q. In ordering your policemen not to make these arrests, are we to
infer——
A. Infer and understand this, that in ordering these policemen to be
careful how they made arrests, it was after I had considered I had
been superseded, and I wanted them to make the arrests when they
made them in such a way as not to create any disturbance.
Q. Are we to infer from your evidence upon that point that your
manner of managing such a mob would be to give way to them, and
not oppose force to the crowd?
A. I have said nothing, I think, to indicate that.
Q. What would be your plan in such a case?
A. I would have policemen to do it. I don't think the policemen
would create such a truculent feeling as an arrest by the use of
military.
Q. You think then that the police are the proper force to use on such
occasions?
A. Until you ascertain you can do nothing with them, until all other
means have failed, and then, and not till then, are the military to be
used.
Q. Did you attempt at any time on Sunday to gather your police
force in a body so as to have an organized force large enough to
accomplish something?
A. I could not get any force on Sunday large enough.
Q. You got fifteen—you say there was fifty or sixty policemen—did
you undertake to gather that body?
A. I did not say there was fifty or sixty policemen. I am talking now
about the night before.
Q. I think the question was asked you how many there was about
there on Sunday?
A. I could not tell how many were there. I know only a small body of
them could be got together, and then they began to collect the men
who had went home in the morning before we knew that the
soldiers had been withdrawn—they began to gather in before dark—
then we had a pretty good force, and then with such assistance as
citizens gave, we broke the back of the riot—we knocked them right
and left.
Q. Hadn't whisky helped a good deal at that time to place them hors
du combat?
A. I don't know about it myself, I do not drink it.
Q. I did not ask you as a connoisseur.
A. I think it had the effect to make the crowd vicious. I thought so
when I was in their hands.
Q. This Sunday night and Monday morning was when you first began
to regain some control there?
A. We got control—from dark on Sunday evening we had control.
Q. The mob had kind of petered out then?
A. Yes, and they had been licked out by the police and citizens.
Q. Where had there been any set-to where the mob had been licked
—at what place?
A. At the Fort Wayne depot, at the intersection of Tenth and Liberty
street.
Q. What police had had the set-to with the crowd at the Fort Wayne
depot?
A. There was eight or ten policemen went there when the car was
afire, and they put that out, and they were assisted by citizens also.
Q. How large a crowd did they find to contend with?
A. I don't know, it was an accomplished fact. The mob began to
break in stores, and commenced at the corner opposite to Tenth on
Liberty street, and the police and the mob had the battle there.
Q. How many police were there engaged in that battle?
A. There was a considerable number.
Q. Do you know how large a crowd there was there?
A. I am told the streets are full.
Q. What kind of a crowd was it?
A. Breaking into stores.
Q. The same crowd that had been burning cars?
A. I don't know.
Q. What was it composed of—this crowd running about the streets?
A. They were composed of men and boys. We had another battle
with them at Seventeenth.
Q. This crowd that was plundering was easily dispersed at any time?
A. Easy. They were not people to be afraid of.
Q. Who were the people to be afraid of?
A. Those standing around doing nothing.
Q. Was there an apparent organization among them?
A. I don't know.
Q. Could you judge?
A. I don't know whether there was an organization; there appeared
to be a common feeling. I was astonished from the fact that I didn't
know them.
By Senator Yutzy:
Q. They appeared to be strangers?
A. They were strangers to me, I did not recognize them.
Q. In your intimate acquaintance with the people, you would take
them to be people from elsewhere?
A. I thought I knew the people about Pittsburgh, but I didn't know
these. I don't want to swear that they were strangers. I don't know
that I know. I was recognized, and I thought I ought to recognize a
great many of them.
Q. Those that were engaged in the act of rioting and police?
A. I am speaking more especially of those who captured me in the
railroad yard, and carried me out in front of the depot.
By Mr. Engelbert:
Q. They did that systematically, did they?
A. Oh, yes; carried me right out.
By Senator Yutzy:
Q. Did you, at any time during the riots, employ your night force in
the day time?
A. Such of them as we could get. Understand this, my idea of this
matter was that the soldiers, having possession of the railroad
property, were cooped up for the night, and that when daylight
would appear they would go out into the open ground, and take
possession of things. My idea was, they went into this place to
prevent being pushed back during the night. The great body of the
police force went off at six o'clock in the morning. I, supposing that
the police would have nothing to do, except to do street duty under
this excitement, and had instructed the chief of police to call upon
the discharged policemen, supposing that he could get plenty of
them, but that expectation was not realized, and not expecting that
the soldiers would leave the city at the time they did, had given no
orders to keep the night policemen on duty that morning; but when
I found that the soldiers had all dispersed, I telegraphed down to
the central station to detain such policemen as were there—and
there were some there—and they were detained, and they were on
duty all day.
Q. Did you make any effort to re-assemble the night police after you
ascertained they had left?
A. Could not do it.
Q. Did you make any effort?
A. Could not do it.
Q. Could not you find them?
A. You couldn't get a man to go after them—the great body of them
—until night would come. You would get them just as soon by
waiting until they came on duty.
Q. Didn't you have the address in your mind?
A. Yes; and knew where they lived. We had plenty to do without
doing that.
Q. Any more important duty to perform than to get these men to
assemble?
A. That would depend altogether upon what the man in charge
thought. I thought the most important duty was to have the police
up there—all we could get—and let them do what they could.
Q. Without calling on the night police?
A. If we had means of calling on the night force to gather them in, it
would have been done, but, to do so, we would have had to
abandon everything else for the time being. Possibly, that might
have been as well, though. When I went to the corner of Seventh
and Grant streets, I found the firemen playing there, and the police
having charge of the ropes—keeping the crowd away from them.
Q. Did you employ all your powers during these riots, regardless of
any other efforts adopted to subdue the riots, in preserving the
peace?
A. What do you call during the riots?
Q. The time from Thursday until Sunday?
A. Because I didn't think there was any riot before five o'clock on
Saturday.
Mr. Lindsey: That question requires a direct answer—yes or no.
By Senator Yutzy:
Q. Did you exhaust all your powers during the riots, irrespective of
these other parties?
A. I say there was no riot until four or five o'clock in the morning,
when the soldiers charged bayonets on the crowd.
Q. Including all within the time from Thursday until Monday, did you
exhaust——
A. I knew of no riots until the soldiers charged bayonets on the
people. I have answered that question a dozen of times.
Q. Answer it yes or no?
A. I will not answer it yes or no. All my powers were exhausted in
preserving the peace so far as I thought I could exercise them. That
is the answer to that question.
Q. Have you any call—is there any call to assemble the police, by
telegraph or otherwise?
A. We have a police telegraph from each station-house. We send
messages on it every day.
Q. There is no particular call by which you assemble your police?
A. There is no alarm.
By Mr. Lindsey:
Q. I want to ask the mayor a question in connection with his answer
to this. He says he used all his powers in preserving the peace, so
far as he could exercise them. Was there anything to prevent you
from exercising your powers as mayor?
A. Yes; the ground had been occupied by the State military and the
sheriff, and occupied in a way that it was utterly impossible for me to
act with them.
Q. And it was the only thing that prevented you from exercising your
powers?
A. I will say that there was a party went down to the depot—the
Duquesne depot—Sunday afternoon, stating he was going to set it
afire. That man was arrested by the police, assisted by some
citizens, and taken to the lock-up.
Q. You know that there was an assemblage of men at or near
Twenty-eighth street during the day, on Friday, don't you?
A. I presume there was, or Mr. Watt would not have come down
there and asked for police.
Q. For the purpose of protecting trains going out?
A. No, sir; I didn't know that. I don't think I knew that.
Q. For what purpose were they assembled there, so far as you
know?
A. I only knew about them from Mr. Watt, and what he told me, I
have forgotten now.
Q. You have forgotten what he told you?
A. Yes, sir.
Q. Did you take any measures to ascertain what the purpose of the
assemblage was?
A. I think Mr. Watt must have told me what it was, and I judge so.
The first thing I heard after the police went there, was that a man
had struck Mr. Watt.
Q. I want to know if you don't know that during the day on Friday,
and during the day Saturday, there was a large assemblage of men
at or near Twenty-eighth street?
A. I knew that by common report, and hearing the police talk.
Q. Was not that an unlawful assemblage of men?
A. It may have been an unlawful assemblage of men.
Q. Didn't you know it was an unlawful assemblage of men?
A. I don't know, I presume it would have been an unlawful
assemblage. I presume that they were there for an unlawful
purpose.
Q. You did not take any pains to disperse that assemblage?
A. Have I not answered that question a dozen times?
Q. What is your answer? Did you take any measures to disperse that
assemblage?
A. I didn't for the reason that I have given you—for the reason I
repeated a dozen times to different other questions, in different
forms. There is a good deal more I would like to tell you.
Q. You say on Thursday you sent police officers there, and they got
on a train, and they attempted to run that train out?
A. And couldn't run it out.
Q. Why didn't they run it out?
A. Because the engineer stepped down and out.
Q. Why did he step down and out?
A. Because he wanted to.
Q. Was there any men taken by force?
A. Oh, no.
Q. Was there a crowd there at that time?
A. I suppose there were a great many people there. I have no doubt
there was.
Q. Don't you think it was an unlawful assemblage, and that it was
your duty, as mayor, to have gone there, and have dispersed that
crowd?
A. The police were there preserving the peace. They were there and
preserved the peace to such an extent, that the police say that they
were on that train, and that train could go out. There was nothing to
hinder it, if the engineer had stuck to his post; but, instead of that,
he stepped off his engine, and left the police in charge. That is the
report of the police to me.
Q. Wasn't it your duty to disperse that crowd there, as mayor of the
city?
A. No; because I knew nothing of the details of that, at this time;
because Mr. Watt got all the police that he needed, and they got
more than they wanted—said they had more than they wanted, and
they had the direction of them there, and the presumption is that
the police did just what they wanted them to, and the only breach of
the peace that occurred there was that of which Mr. McCall was
arrested for—striking Mr. Watt—and taken to the station.
Q. Was not the train uncoupled? When they attempted to start that
train, didn't they rush on and uncouple the cars?
A. I guess you are talking about the trains they attempted to run
early in the morning, before the police came there. That is what I
think. It was on that occasion that Mr. Watt came down after the ten
policemen.
Q. Didn't Mr. Watt tell you of the circumstances?
A. I suppose he did.
Q. Didn't you have knowledge then that there had been a riot, or, at
least, a disorderly crowd there, and wasn't it your duty then to
protect those people?
A. And for the purpose of doing that, Mr. Watt came and asked for a
certain number of policemen—for what he thought was sufficient—
and they were soon there.
Q. And still you allowed that crowd to remain there?
A. That is not a fair way to put it.
Q. I want to get at the reasons that actuated you?
A. I didn't know anything of the nature of that crowd. I knew
nothing more at the time than that Mr. Watt wanted ten men, and
ten men was sufficient to control it. That was sufficient. They were
there, and there was only one breach of the peace, and that man
was arrested, and when this train, between three and four o'clock,
undertook to be run out, it could have been run out.
Q. Did the crowd intimidate the engineer in any way, do you know?
A. I understood the police that he was not intimidated—that he
could have gone out with the train, if he thought proper. They were
there to protect him in so doing. They told me he could have gone
out, if he had chosen. I don't know who he is, anything about him. I
guess it was the last effort made to run a train out.
By Senator Yutzy:
Q. Did you consider at any time until the military arrived that the
crowd that assembled there was an illegal crowd?
A. Oh, no; I didn't think it amounted to shucks.
Q. You consider there was no riot or mob nor illegal assemblage at
any time before the military arrived?
A. I knew that there were men in a crowd.
Q. Answer that question now. You consider there was no illegal
assemblage, mob, or riot previous to the arrival of the military?
A. I think that in the ordinary acceptation of the word mob and riot,
there was no mob and riot previous to the military coming there.
Q. Or illegal assemblage of people?
A. I think any persons that go on the Pennsylvania Railroad
Company's ground, don't obey their lawful orders and proper orders,
that it is an unlawful assemblage.
Q. Was there any illegal assemblage?
A. I have no doubt there was.
Q. Were you aware of that?
A. I must have been aware. It could not have been otherwise.
Q. Did you make any efforts to disperse them?
A. Yes; I gave the Pennsylvania Railroad Company all the police they
asked for.
Q. Did you drive them off?
A. I don't think they were driven off, but the Pennsylvania railroad
got all the police they asked for.
Q. You didn't give them the officer they asked for?
A. In asking for me?
Q. Yes; you?
A. No; I was not going up to head ten policemen.
Q. You required them to pay the police also?
A. No, sir; you put your statement too broad. These policemen—we
took what policemen we could belonging to the city and filled up
with the others who were not in the pay of the city.
Q. And those others were paid?
A. I think there must have been about twenty-nine policemen
outside of such of the city folks as were considered.
Q. The extras were paid off by the Pennsylvania railroad?
A. Yes; they were paid by them.
By Mr. Lindsey:
Q. You didn't call on any of the night force to go at that time?
A. No, sir; we couldn't do that. Nothing but the most imperative
necessity would require that. We only had patrolmen to cover
twenty-seven square miles. At the riot on Saturday night every man
was called in from the first, second, fourth, seventh, eighth, and
ninth districts; they were left entirely unprotected.
At this point the committee adjourned until this afternoon, at two
o'clock.
AFTERNOON SESSION.
Pittsburgh, Friday, February 22, 1878.
The committee met, pursuant to adjournment, in the orphans' court
room at three o'clock, P.M., Mr. Lindsey in the chair.
All members present.
R. L. Hamilton, sworn:
By Mr. Lindsey:
Q. Where do you reside?
A. 810 Penn avenue.
Q. What is your business?
A. I am a clerk for the water-works of the city of Pittsburgh—clerk of
the water-works. I believe it is called, sometimes, clerk of the water
extension committee.
Q. How long have you held that position?
A. I have held the position of clerk of the water-works since
February, 1876—February 4, I believe.
Q. Where is your office?
A. City hall. Third floor of the city hall. Municipal hall as it is called.
Q. State whether you were at or in the vicinity of Twenty-eighth
street, on Saturday the 21st day of July?
A. I was.
Q. When the firing occurred?
A. I was in the vicinity at the time of the firing.
Q. Where were you—what was your position?
A. I can hardly understand the question.
Q. Where were you in relation to where the troops stood—explain
the situation you occupied?
A. At the time of the firing I was running.
Q. Which direction?
A. Well, towards Liberty street and Twenty-ninth street, to get a
brick house between me and the troops.
Q. Go on, and relate what you saw, commencing at the time you
arrived at, or in the vicinity of Twenty-eighth street?
A. To explain the question, there was a meeting of the water
committee called for Monday evening, and some two or three
members of the water committee lived out in that direction. I started
at that notice, and at two o'clock I arrived at Twenty-eighth street. I
went up Twenty-eighth street to the Pennsylvania railroad tracks,
and when there, I was informed that the Philadelphia troops were
about to come out, and I waited to see them until sometime after
four o'clock. These troops came out headed by the sheriff and
several citizens of Pittsburgh, and after they had formed themselves
in position, the sheriff commenced speaking to the crowd, and I
couldn't hear what he was saying from where I was standing, and I
got on a coal truck where I thought I could hear what he was
saying. When I was on this truck, one company of the Philadelphia
troops—the troops, at that time, were formed in two lines facing the
hill, that is, the line next me was facing the hill. I wouldn't say
positively about the line nearest the hill. I was near the round-
house. There was one company of the Philadelphia troops brought
up in single rank, they marched up very quietly until they got to the
switch below Twenty-eighth street. They were met by the crowd,
that is, a crowd of men that refused to go any further. There were
orders given very quietly, and another company, with black plumes
on their hats, came up, and this first company was put in double
rank. They tried to force the crowd back, and the order was given to
charge bayonets. The officers of the Philadelphia troops were in the
rear of those two companies, they were charged up on the track,
and after sometime, there was an order given to fire by the different
officers of the Philadelphia troops.
Q. I wish you would now repeat what you said, beginning with the
order which was given to charge bayonets, commencing about
there, and repeat what you said?
A. After the second company had been brought up—the company
with dark plumes on their hats, I cannot tell what the uniform was—
after that, there was an order given to charge bayonets, and it was a
very short time after this order to charge bayonets—that was only
given to the two companies, the other files were standing, the rest
of the Philadelphia troops were standing in two lines on each side of
the railroad track—after that order given to charge bayonets, almost
immediately, I heard the command given by several officers of
Philadelphia companies, that is, I suppose they were from
Philadelphia. I don't know them personally, but from their uniform,
and from the position in which they were. The order to fire was
given by several men in the uniform of officers of that regiment.
Q. Where did you stand during this time?
A. I stood on a truck loaded with coal. The left of the railroad tracks
going out almost immediately in front of the sand-house of the
Pennsylvania railroad, this side of Twenty-eighth street.
Q. How far from the tracks?
A. I could have stooped down and touched three of the militia with
my hands, by stooping.
Q. How far were you from them at the time the order to charge
bayonets was given?
A. I was in the same position. I had not left that position from the
time I got up there to see what was said by the sheriff until I heard
the order given.
Q. What officers gave the order to charge bayonets?
A. I couldn't say. I heard, but I couldn't say how it was given. The
orders at that time were given very low. It was not to the whole
regiment.
Q. From what direction did the order come?
A. Right from the rear of the two companies that were marched up
the track, and they were not charging when the order was given.
Q. How did they have their arms when the order to charge bayonets
was given?
A. The two companies, I think the whole of them, were at carry
arms, from what I know of the present tactics.
Q. Were any of them at arms port?
A. Some of them in the charging parties had their guns at arms port
—some of the charging party.
Q. Did you hear that command given?
A. No, sir; I didn't hear that command given, but I know now that
some of them had their guns at arms port, because I remember the
guns being in the position of arms port—some of them. A party
directly in front of me were at carry arms.
Q. They were standing still?
A. Yes. They were in line. I think they were at a carry, so far as I can
remember. I cannot swear positively as to that.
Q. When you heard the command given to charge bayonets, how
close were those two companies to the mob?
A. Just as close as they could get.
Q. And the mob resisted them?
A. Yes, sir.
Q. When the order was given to charge bayonets, did the two
companies obey the order?
A. Part of them did. I could see them lunge with their bayonets—try
to force them back.
Q. Did the crowd resist that charge?
A. Some of them did; yes, sir.
Q. And attempted to pull——
A. I heard parties say that if they would let them out in any way,
they would be glad to do so. It was the crowd back of them that was
holding them in. Others resisted.
Q. Did they try to pull the bayonets off the guns?
A. I saw them wrenching with the guns. Saw them wrenching the
guns, and heard remarks made by different parties in front of the
party charging bayonets that if they would give them room to get
back they didn't want to interfere. I heard these remarks made from
where I was.
Q. And the command to fire, you say, was given by captains?
A. I don't know about captains. I say officers of the Philadelphia
companies that the word "fire" was given by.
Q. By officers of companies?
A. Company officers is what I say the word was given by.
Q. And not by field officers?
A. I wouldn't know that the field officers were with that regiment,
but I knew from the position——
By Mr. Reyburn:
Q. You mean from the position they occupied, they were company
officers?
A. I suppose they were company officers. They were in the rear of
the two ranks facing me.
Q. Had any stones and missiles been thrown at the soldiers before
the command to charge bayonets was given?
A. I cannot say positively as to before the command to charge
bayonets was given.
Q. Were any thrown at the troops before the command to fire was
given? Were there any shots fired by the crowd before the command
to fire was given?
A. Not that I either saw or heard—not before the command to fire.
Q. Missiles had been thrown?
A. They had been thrown—I saw them thrown.
Q. Were any of the soldiers hurt?
A. Not that I saw. I saw one of the officers—I supposed to be a field
officer—saw him hit, and it staggered him, but he didn't seem to be
hurt—kind of shoved him to one side—it seemed to be a piece of a
board or piece of wood—something like a block of wood—it was
thrown from the hill side, and hit one of the officers. I saw that
myself—not thrown from the hill side, but from what they call the
watch-box—it is a watch-box. It was thrown from the back of that by
a boy.
Q. You saw the boy?
A. It was a young fellow about sixteen or seventeen years of age,
from what I could judge from his appearance.
Q. When the firing commenced, you ran?
A. I ran before the firing commenced. I was back of what they call
the Hill house.
Q. Did you run before the command was given?
A. No, sir; I didn't. Whenever I heard the command given, I thought
I had no business there, and I got out of the road, that is one thing
that made me so positive the command was given. My idea of
getting out of the road was on account of that command to fire.
Q. In what words—was there more than one command?
A. There was no more than one command. The word fire was given
by different men in uniform. They were standing not in the rear, but
in front of the line of militia that was right in front of me. I heard
that from more than one voice.
Q. In what words was the command given?
A. The command I speak of as given by those parties, was the word
"fire."
Q. Addressed to any particular person?
A. Not by those parties—just "fire."
Q. How do you know who gave that command?
A. I could hear them; I don't suppose I was six feet from some of
them.
Q. Could you pick out the men who gave the command?
A. That gave the word fire?
Q. Yes?
A. No, sir; I couldn't.
Q. Then you don't know who it was that gave the command?
A. That gave these commands? No, sir.
Q. You say it came from officers in command of a company?
A. It came from what I supposed by the position they held—they
were strangers to me.
By Mr. Reyburn:
Q. Couldn't you distinguish the officers from the private?
A. I thought I could. It was what I consider officers. I didn't pay that
much attention. I had no idea there was going to be such a
command given, and paid no attention to officers nor privates.
These parties had no guns. Whether they were captains or
lieutenants, or what, I couldn't say.
Q. You wouldn't pretend to say what man it was gave the command,
or pick out the man?
A. That gave this command I speak of? No, sir.
Q. You could only tell the direction in which the words came?
A. If they had been Pittsburgh troops had been there, I suppose I
could have told every man of them. I could not point out the men if
they were brought before me now.
Q. Could you see the man who uttered the words?
A. Yes, sir.
Q. So as to pick him out?
A. I could, provided I had seen enough of the man. I couldn't
remember him now. I believe if I could see the man that I first heard
these words "fire" from; if I would have seen him the next morning,
I could point him out. I don't remember of having seen him since,
and I don't know that I could point him out if he was here.
Q. How was he dressed?
A. Dressed in a gray uniform. He was in full uniform, with gold lace
on it.
Q. What rank did his uniform indicate?
A. I didn't pay that much attention to him to find out what his rank
was. The militia uniform is so badly mixed, I could hardly tell what
the man's rank would be. The uniform seemed to be about the same
in all the officers. I didn't pay any attention to these troops as
regards that.
By Mr. Reyburn:
Q. Had he a plume, the same as the privates?
A. I couldn't say.
Q. Didn't notice?
A. No, sir.
By Senator Yutzy:
Q. How many officers did you hear give this command to fire?
A. I couldn't say exactly. I suppose seven or eight.
Q. All gave the command to fire?
A. Yes, sir; that is, I heard it in that many different voices; I couldn't
say how many officers, but in that many different voices.
Q. Not at one and the same time?
A. Not at one and the same time.
Q. Did any other words precede the word "fire?"
A. Not by the officers I speak of.
Q. Nothing but simply "fire?"
A. Simply "fire."
By Senator Reyburn:
Q. You are sure they didn't say not to fire, and you only heard the
word "fire?"
A. I am sure of the parties I speak of.
Q. That they were not cautioning their men not to fire on the crowd?
A. No, sir; I am sure of that.
Q. Couldn't you have made a mistake, and only heard the last word?
A. Not from the position I was. The parties may have been mistaken
in regard to where they got their order.
Q. When they were ordered to charge bayonets, what was the
command given to charge bayonets?
A. As I spoke before, the command was given, that I could hear the
command but couldn't hear what was said to the troops. It was
given to two companies in a low tone of voice, but what I
understood to be "charge bayonets," and a charge bayonets was
immediately made after this order. It was in a low tone of voice.
Q. Not as a military officer ought to give a command?
A. Not as I would suppose a military officer should give a command.
I am not posted in regard to how they should give it.
Q. He didn't say it as though he meant business?
A. It looked very much like it.
Q. He gave it in a low tone of voice?
A. Just gave it in a low tone of voice to those two companies—it was
a command to those two companies.
Q. When he gave the command fire, did he speak it distinctly as
though he meant exactly what he said?
A. Who are you speaking of?
Q. The officers that gave the command?
A. Yes; they spoke it distinctly.
Q. As though they meant exactly what they said?
A. I supposed from that they meant it, that is the reason I got out of
the road. I thought they meant what they said.
By Senator Yutzy:
Q. What position did those officers occupy when this command to
fire was given. The officers I speak of giving the word "fire?"
A. They were in front of the command.
Q. In front of the rank?
A. In front of the rank. There was no room for them in place else.
Q. You are sure they were in front of the rank?
A. Yes, sir.
Q. Seven or eight of them, you say?
A. If you will allow me to explain about the officers. Six, seven, or
eight. There was two ranks of troops, stretching from the switch at
Twenty-eighth street down the track in two ranks, and those two
companies were at the upper end. What I supposed to be the
general officers were in the rear of those two officers, and the other
officers were scattered down along. There was two lines. There was
seven or eight not scattered along, because they were over near to
what I considered to be the generals.
Q. They were in front of the rank?
A. The line was facing this way. [Illustrating.] There was no officers
outside of this rank [indicating] that I could see, and there was no
room in this rank, because here is a truck—a coal truck. I stood from
where I could stoop down and touch the soldiers.
Q. Wouldn't you suppose this was a pretty bad place for an officer to
stand?
A. I should think it was.
By Senator Yutzy:
Q. These officers stood between the mob and their men?
A. No, sir.
Q. They were behind the men?
A. What I consider the mob was at the switch at Twenty-eighth
street. That was the switch here. [Illustrating.] The Philadelphia
troops were formed in two ranks. There was the two companies
coming up here, [indicating,] one in single file, and when they got to
the switch the men stopped them. They were in single line. This
company was brought up between the two lines, forcing every
person out, keeping that part of the track clear. They succeeded until
they got to this switch. When they got to the switch one company
was not successful in driving them back.
By Senator Reyburn:
Q. You said the officers were in front of the men, did you mean
those men that were standing in line? The officers were in front of
them, was the ones you speak of?
A. Yes, sir.
Q. It was these officers gave the command to fire?
A. These officers I was speaking of.
Q. It was not the men that were marching up to clear the crowd—I
mean marching towards the crowd?
A. It was not those officers I heard.
Q. It was the bystanders? Those officers had nothing to do with
those companies?
A. No, sir; not with those two companies up the track—no, sir.
Q. Did the companies commanded by the officers who gave the
command, fire?
A. I didn't wait to see.
Q. You don't know that they did fire?
A. Not from my own knowledge, but from the parties wounded and
killed, I would suppose so.
By Mr. Larrabee:
Q. How long after the command was given did you hear the firing?
A. I got back of this house before I heard any firing.
Q. What distance was you from the crowd, where you stood, when
the command was given, when the firing began?
A. I suppose I would be a distance about forty yards, before I heard
any firing.
Q. After the command to fire was given, you retreated to the oil-
house?
A. I got the oil-house between me and the Philadelphia troops.
Q. How far was that from where you stood when the command was
given?
A. I think it was forty yards from where I stood on the track.
Q. How long after you got to the oil-house, did you hear the firing?
A. I could hardly tell—it was a very short time. I don't think you
could count a minute.
Q. You think you were not behind the oil-house one minute before
the firing began?
A. Until I heard the firing.
Q. You started as soon as ever you heard the command to fire?
A. Just as soon as I could get off the track. As soon as I heard the
command "fire," I commenced my way back in this crowd on the
track, just as quick as I could get off and run.
Q. About how long did it take you to get through that crowd and
behind the oil-house?
A. Didn't take me very long. I was not very long getting there, I
know that.
Q. A minute?
A. I do not think I was a minute getting off the track. I was over a
minute getting behind the oil-house.
Q. You were there not over a minute before you heard the firing?
A. I am sure of that.
Q. Do you think it was two minutes after the order to fire was given,
before the firing began?
A. I think so; yes, sir.
By Mr. Reyburn:
Q. Where did these stones and missiles come from?
A. The things I saw thrown were right from back of what we call a
switch-tender's shanty. There is a little shanty we call the switch-
tender's shanty. It was parties standing back of that—I could see it
from where I was standing—most of them that were thrown.
Q. How much of a shower of stones was it?
A. There was no shower. There was not even a slight storm. It was
not what I would call a shower of stones.
Q. Only two or three stones thrown?
A. There might have been—I guess I saw six or seven. There were
lumps of mud and pieces of wood. I do not think I saw a stone. I did
see mud—that is, hard mud seemed to be taken from the side of the
hill.
Q. Did you see one of those soldiers fall, in the ranks that marched
down there?
A. Yes, sir; there was one of them fell, and they picked him up, and
took him into the hospital grounds. He was sun-struck, or something
of that kind.
Q. How do you know he was sun-struck?
A. That is what some of his comrades claimed. Before they got to
Twenty-eighth street this man dropped. He seemed to be a Jew,
from his looks. The boys used the expression: "Let the damned Jew
lay there." The railroaders got water for him, and bathed him.
Q. Have you ever told anybody that you heard the firing there, and
heard the command given to fire?
A. I was a witness in the criminal court, in the murder case against
General Pearson.
Q. Have you told anybody outside that you heard the command to
fire given?
A. I believe I did.
Q. Have you told persons you heard General Pearson give the
command to fire?
A. Not in direct words.
Q. Have you not stated several times, on the street corners, to
different parties, that you heard General Pearson give the command
to fire?
A. No, sir; I do not think I ever did—not in those words.
Q. Did you ever state to anybody that you had heard the
commanders of companies give the command to fire, before stating
it here?
A. I do not know. I forget exactly just what words my testimony was
in the court.
Q. I am not asking you what testimony you gave in the court. Have
you ever stated to any person before to-day, outside of the court, or
anywhere, that you heard officers of companies give the command
to fire?
A. I believe I have. Yes, sir.
Q. And you have stated that you heard General Pearson give the
command to fire?
A. Not in those words.
Q. What do you mean by "Not in those words?"
A. I think the order to fire emanated from General Pearson, but I
never said, in direct words, that General Pearson gave the order to
fire.
Q. It was only a supposition of yours?
A. No; it was from the remark that I have sworn—I heard General
Pearson give this—my remark was that General Pearson had turned
around to other officers, with whom I am not acquainted, and used
the expression, "Your men to fire;" but I did not say he had coupled
those words with "Order your men to fire."
Q. Did you hear him say those words?
A. I have sworn. Yes, sir.
Q. To whom?
A. As I told you, I was not acquainted with the officers to whom he
addressed himself. He was speaking to parties in gray uniform. He
was standing almost immediately in his rear.
Q. He said, "Your men to fire?"
A. Yes, sir.
Q. How far were you from him?
A. I suppose I would be—I could hardly judge the distance—I would
take it to be about ten feet or so.
Q. Did he speak it in a low tone?
A. It was not very loud. It was not a low tone.
Q. Was there a good deal of noise and confusion about at that time?
A. Oh, considerable, just in certain localities.
Q. The crowd was boisterous, were they not?
A. To a certain extent.
By Senator Reyburn:
Q. You did not hear any command given to fire, positively, by
General Pearson?
A. No, sir; I never said so.
By Senator Yutzy:
Q. How do you account for the long interval of time intervening
between the command to fire and the firing?
A. I could not say.
Q. Did they load after the command to fire was given?
A. I could not say.
Q. Did you see them load?
A. No, sir; I did not see them fire.
By Senator Reyburn:
Q. There was nothing preparatory at all, to this word fire?
A. No, sir; I thought it very strange myself, at the time the command
to fire was given. They were not even ready.
By Senator Yutzy:
Q. You say you heard General Pearson speak to those officers, and
said something about firing. You do not know whether he said not
allow the men to fire, or to fire?
A. No, sir.