This thesis examines an unsupervised approach to classifying users in online social networks using only simple statistics about users' behavior. The author applies sparse principal component analysis (SPCA) to Twitter data without using text or profile content. Key contributions include:
1. Demonstrating that meaningful user classification is possible using only statistics on network structure and communication patterns.
2. Developing a "semantic robustness" score to evaluate how well classifications retain their meaning when subsets of the data are reanalyzed.
3. Interpreting the top principal components as distinct user types and scores, including measures of influence, spam detectors, and indicators of content providers.
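
To make the idea of sparse components over behavioral statistics concrete, the following is a minimal sketch and not the thesis's implementation. It assumes four hypothetical per-user features (followers, mentions_received, tweets_per_day, links_per_tweet), generates synthetic data rather than real Twitter data, and swaps in a simple soft-thresholded power iteration as a stand-in for whichever SPCA solver the author used. The penalty `lam` and all function names are illustrative assumptions.

```python
import math
import random

# Hypothetical feature names; the thesis's actual statistics may differ.
FEATURES = ["followers", "mentions_received", "tweets_per_day", "links_per_tweet"]

def make_users(n, rng):
    """Synthetic users: two features share a latent factor, two are pure noise."""
    rows = []
    for _ in range(n):
        z = rng.gauss(0, 1)  # latent factor driving the first two features
        rows.append([
            z + 0.3 * rng.gauss(0, 1),
            z + 0.3 * rng.gauss(0, 1),
            rng.gauss(0, 1),
            rng.gauss(0, 1),
        ])
    return rows

def standardize(rows):
    """Center each column and scale it to unit variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]

def covariance(rows):
    """Covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
            for i in range(d)]

def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values within lam of zero become zero."""
    return math.copysign(max(abs(x) - lam, 0.0), x)

def sparse_component(cov, lam=0.3, iters=200):
    """Leading sparse loading vector via soft-thresholded power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        w = [soft_threshold(x, lam) for x in w]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # penalty too strong: every loading was zeroed out
            break
        v = [x / norm for x in w]
    return v

rng = random.Random(0)  # fixed seed so the sketch is reproducible
users = standardize(make_users(1000, rng))
loadings = sparse_component(covariance(users))

for name, weight in zip(FEATURES, loadings):
    print(f"{name:>18}: {weight:+.3f}")
```

In this synthetic setup the two correlated features should carry the nonzero loadings while the two noise features are driven to zero. That sparsity is what makes a component easy to read as a user type: a component that loads only on a couple of statistics can be given a plain label such as "influence".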