He Says, She Says: Conflict and Coordination in Wikipedia
Aniket Kittur, Bongwon Suh, Bryan Pendleton, Ed Chi
UCLA; Augmented Social Cognition Group, Palo Alto Research Center
What is Wikipedia?
“Wikipedia is the best thing ever. Anyone in the world can write anything they want about any subject, so you know you’re getting the best possible information.” – Steve Carell, The Office
Spreading conflict
Policy and procedure
“The degree of success that one meets in dealing with conflicts... often depends on the efficiency with which one can quote policy and precedent.” - Wikipedia admin (survey data)
Collaborative work beneath the surface
  • Visitors only look at article pages
  • But much of Wikipedia consists of other pages
      • Conflict resolution, coordination, policies and procedures
Characterizing coordination and conflict
Exponential growth
Costs of growth
  • Increase in conflict and coordination costs
      • Software development (Boehm, 1981; Brooks, 1975)
      • MUDs/MOOs (Curtis, 1992; Dibbell, 1993)
      • Mailing lists (Sproull & Kiesler, 1991)
  • How has growth affected Wikipedia?
      • Millions of new users and articles
Infrastructure
  • Analyze entire history of Wikipedia
      • Every edit to every article
  • Large amount of data
      • 4+ million pages
      • 58+ million revisions
      • 800+ GB as of June 2006
  • Distributed processing
      • Hadoop distributed filesystem
      • Map/reduce to process data in parallel
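The map/reduce pattern named on this slide can be illustrated with a small in-process sketch. This is not Hadoop code and not the authors' pipeline: the record layout `(page_title, editor)`, the sample data, and all function names are assumptions made for illustration. It counts revisions per page by mapping each revision to a key/value pair, grouping by key, and reducing each group.

```python
from collections import defaultdict

# Hypothetical revision records: (page_title, editor). The real dump format differs.
REVISIONS = [
    ("Dokdo", "alice"),
    ("Dokdo", "bob"),
    ("Talk:Dokdo", "alice"),
    ("Dokdo", "alice"),
]


def map_revision(record):
    """Map step: emit one (key, value) pair per revision."""
    page_title, _editor = record
    yield page_title, 1


def reduce_counts(key, values):
    """Reduce step: combine all values that share a key."""
    return key, sum(values)


def run_map_reduce(records, mapper, reducer):
    """Run map, shuffle (group by key), and reduce inside a single process."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


revisions_per_page = run_map_reduce(REVISIONS, map_revision, reduce_counts)
```

On a cluster the map and reduce calls would run in parallel on separate machines; the grouping step in the middle is what the framework provides.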
Types of work
  • Direct work: immediately consumable (article pages)
  • Indirect work: coordination, conflict (talk, user, procedure pages)
  • Maintenance work: reverts, vandalism
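A minimal sketch of how edits might be split into direct and indirect work, assuming the page type can be read from the title prefix. The prefix list and the helper names are illustrative assumptions, not the classification the authors used.

```python
from collections import Counter

# Assumed mapping: any title with one of these prefixes counts as indirect work.
INDIRECT_PREFIXES = ("Talk:", "User:", "User talk:", "Wikipedia:", "Wikipedia talk:")


def work_type(page_title):
    """Classify an edit as 'direct' (article) or 'indirect' (talk, user, procedure)."""
    return "indirect" if page_title.startswith(INDIRECT_PREFIXES) else "direct"


def work_proportions(edited_titles):
    """Return the share of edits falling into each work type."""
    counts = Counter(work_type(title) for title in edited_titles)
    total = sum(counts.values())
    if total == 0:
        return {"direct": 0.0, "indirect": 0.0}
    return {kind: counts[kind] / total for kind in ("direct", "indirect")}


# Hypothetical sample: one title per edit.
sample_edits = [
    "Dokdo",
    "Talk:Dokdo",
    "User talk:Alice",
    "Wikipedia:Neutral point of view",
]
proportions = work_proportions(sample_edits)
```

Computing these proportions per month or per year would give the kind of trend lines the following slides describe.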
Less direct work
  • Decrease in proportion of edits to article pages (chart label: 70%)
More indirect work
  • Increase in proportion of edits to user talk (chart label: 8%)
  • Increase in proportion of edits to procedure (chart label: 11%)
More maintenance work
  • Increase in proportion of edits that are reverts (chart label: 7%)
More wasted work
  • Increase in proportion of edits reverting vandalism (chart label: 1-2%)
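The slides do not say how reverts were detected. One common approach, assumed here purely for illustration, is to treat a revision as a revert when its full text is identical to an earlier revision of the same page. The function name and the sample history are hypothetical.

```python
import hashlib


def revert_flags(revision_texts):
    """Return one bool per revision: True if its text matches any earlier revision."""
    seen = set()
    flags = []
    for text in revision_texts:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        flags.append(digest in seen)
        seen.add(digest)
    return flags


# Hypothetical page history, oldest first.
history = ["v1", "v1 plus vandalism", "v1", "v2"]
flags = revert_flags(history)
revert_share = sum(flags) / len(flags)
```

Hashing keeps memory use proportional to the number of revisions rather than to their total text size, which matters at the data volumes quoted earlier.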
Global level
  • Conflict and coordination costs are growing
      • Less direct work (articles)
      • More indirect work (article talk, user, procedure)
      • More maintenance work (reverts, vandalism)
Conflict at the article level
  • What defines conflict in articles?
  • Build a characterization model of article conflict
      • Identify page features and metrics associated with conflict
      • Automatically identify high-conflict articles
Page metrics
  • Chose metrics for identifying conflict in articles
  • Easily computable, scalable

Metric type                        Page type
Revisions (#)                      Article, talk, article/talk
Page length                        Article, talk, article/talk
Unique editors                     Article, talk, article/talk
Unique editors / revisions         Article, talk
Links from other articles          Article, talk
Links to other articles            Article, talk
Anonymous edits (#, %)             Article, talk
Administrator edits (#, %)         Article, talk
Minor edits (#, %)                 Article, talk
Reverts (#, by unique editors)     Article
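Several of the metrics listed above are simple counts over a page's revision list. The sketch below shows one way they could be computed; the `Revision` record and its field names are assumptions for illustration and do not reflect the authors' data schema.

```python
from dataclasses import dataclass


@dataclass
class Revision:
    """Hypothetical per-revision record."""
    editor: str
    is_anonymous: bool
    is_minor: bool
    is_revert: bool


def page_metrics(revisions):
    """Compute count and percentage metrics for one page (article or talk)."""
    n = len(revisions)
    editors = {r.editor for r in revisions}
    anonymous = sum(r.is_anonymous for r in revisions)
    minor = sum(r.is_minor for r in revisions)
    return {
        "revisions": n,
        "unique_editors": len(editors),
        "unique_editors_per_revision": len(editors) / n if n else 0.0,
        "anonymous_edits": anonymous,
        "anonymous_edits_pct": 100 * anonymous / n if n else 0.0,
        "minor_edits": minor,
        "minor_edits_pct": 100 * minor / n if n else 0.0,
        "reverts": sum(r.is_revert for r in revisions),
    }


# Hypothetical article history.
article_history = [
    Revision("alice", False, True, False),
    Revision("10.0.0.1", True, False, False),
    Revision("alice", False, False, True),
    Revision("bob", False, True, False),
]
metrics = page_metrics(article_history)
```

Running the same function on the talk page history gives the talk variants of each metric, and ratios of the two give the article/talk variants.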
Defining conflict
  • Operational definition for conflict
      • Revisions tagged controversial
      • Conflict revision count
Machine learning
  • Predict conflict from page metrics
      • Training set of “controversial” pages
      • Support vector machine regression predicting # controversial revisions (SMOreg; Smola & Schölkopf, 1998)
  • Not just conflict/no conflict, but how much conflict
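For reference, the standard ε-insensitive support vector regression problem can be written as below. The symbols are the conventional ones and do not appear on the slide: here $x_i$ is assumed to be the vector of page metrics for page $i$, $y_i$ its count of controversial revisions, $C$ the regularization constant, and $\varepsilon$ the width of the error-free tube. The kernel and parameter values used in the talk are not stated, so this shows only the linear form.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^*}\quad
  & \tfrac{1}{2}\lVert w\rVert^2 + C\sum_{i=1}^{n}\bigl(\xi_i + \xi_i^*\bigr) \\
\text{subject to}\quad
  & y_i - \langle w, x_i\rangle - b \le \varepsilon + \xi_i, \\
  & \langle w, x_i\rangle + b - y_i \le \varepsilon + \xi_i^*, \\
  & \xi_i,\ \xi_i^* \ge 0, \qquad i = 1,\dots,n.
\end{aligned}
```

Because the target is a count rather than a class label, the fitted function $f(x) = \langle w, x\rangle + b$ predicts how much conflict a page has, not merely whether it has any.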
Performance: Cross-validation
  • 5x cross-validation, R² = 0.897
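The evaluation procedure can be sketched as follows, assuming "5x" means 5-fold cross-validation. To stay dependency-free, the sketch swaps in a one-feature least-squares line as a stand-in for the SVM regression model, and the data are synthetic. Only the fold handling and the R² computation are the point here.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (stand-in model, not an SVM)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def cross_validated_r2(xs, ys, k=5):
    """Predict every point from a model trained on the other folds, then score."""
    predictions = [0.0] * len(xs)
    for fold in range(k):
        # Interleaved folds: item i belongs to fold i % k (no shuffling).
        train = [i for i in range(len(xs)) if i % k != fold]
        test = [i for i in range(len(xs)) if i % k == fold]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            predictions[i] = slope * xs[i] + intercept
    return r_squared(ys, predictions)


# Synthetic data: a noisy linear relation between one metric and the conflict count.
talk_revisions = list(range(1, 21))
controversial = [2 * x + (1 if x % 2 else -1) for x in talk_revisions]
score = cross_validated_r2(talk_revisions, controversial)
```

Scoring only out-of-fold predictions is what makes the reported R² an estimate of performance on pages the model has not seen.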
Determinants of conflict
Highly weighted metrics of conflict model:
  —  Revisions (talk)
  —  Minor edits (talk)
  ˜  Unique editors (talk)
  —  Revisions (article)
  ˜  Unique editors (article)
  —  Anonymous edits (talk)
  ˜  Anonymous edits (article)
Identifying untagged articles
  • Detect conflicts for unlabeled articles
      • Majority of articles have never been conflict tagged
  • Testing model generalization
      • Applied model to untagged articles
      • Sample rated by expert Wikipedians
      • Significant positive correlation with predicted scores
      • By rank correlation, p < 0.013 (Spearman’s rho)
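Spearman's rho, the rank correlation named above, is the Pearson correlation of the two rank vectors. The sketch below computes it with average ranks for ties; the predicted scores and expert ratings are made-up values for illustration, and the significance test behind the reported p-value is not reproduced.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical model scores and expert conflict ratings for five articles.
predicted_scores = [0.1, 0.4, 0.35, 0.8, 0.9]
expert_ratings = [1, 2, 3, 4, 5]
rho = spearman_rho(predicted_scores, expert_ratings)
```

A rank-based measure suits this check because expert ratings are ordinal and need not be on the same scale as the predicted revision counts.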
Conflict at the user level
  • How can we identify conflict between users?
  • Reverts as a proxy for user conflict
      • Revert patterns between users
  • Force directed layout to cluster users
      • Group similar viewpoints
      • Find conflicts between groups
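The idea of using revert patterns to group users can be sketched without a layout engine. The code below builds a weighted revert graph from hypothetical `(reverter, reverted)` pairs and then, as a much simpler stand-in for the force-directed layout used in the talk, assigns users to two sides by breadth-first two-coloring so that users who revert each other land on opposite sides. User names and function names are invented for illustration.

```python
from collections import Counter, defaultdict, deque

# Hypothetical (reverter, reverted) pairs extracted from one article's history.
REVERTS = [("ann", "bo"), ("bo", "ann"), ("cy", "bo"), ("ann", "di"), ("cy", "di")]


def build_revert_graph(reverts):
    """Undirected revert graph: edge weight = number of reverts between two users."""
    weights = Counter(frozenset(pair) for pair in reverts if pair[0] != pair[1])
    neighbors = defaultdict(set)
    for edge in weights:
        first, second = tuple(edge)
        neighbors[first].add(second)
        neighbors[second].add(first)
    return weights, neighbors


def split_into_sides(neighbors):
    """Breadth-first two-coloring: each user gets the side opposite the first
    already-placed user they have a revert relationship with."""
    side = {}
    for start in sorted(neighbors):
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            user = queue.popleft()
            for other in sorted(neighbors[user]):
                if other not in side:
                    side[other] = 1 - side[user]
                    queue.append(other)
    return side


weights, neighbors = build_revert_graph(REVERTS)
sides = split_into_sides(neighbors)
```

Two-coloring only captures disputes with two camps. Cases with mediators or with more than two opinion groups, like the examples that follow, are where a layout-based clustering is more informative.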
Dokdo/Takeshima opinion groups
  • Figure labels: Group A, Group B, Group C, Group D
Terri Schiavo
  • Figure labels: Mediators; Sympathetic to parents; Sympathetic to husband; Anonymous (vandals/spammers)
Summary: Characterizing Wikipedia
  • Coordination costs and conflict are increasing
  • Global-level: Trend identification
      • Decrease in direct article work
      • Increase in indirect coordination work
      • Increase in maintenance work
  • Article-level: Prediction using machine learning
      • Identify characteristics of article conflict
      • Detect conflict-heavy articles needing extra attention
  • User-level: User conflict visualization
      • Make sense of user conflicts and identify shared viewpoints
Future Work
  • Applied to many domains
      • Corporate memory (Socialtext)
      • Intelligence gathering (Intellipedia)
      • Scholarly research (Scholarpedia)
      • Collaborative problem solving (Lostpedia)
  • Application: Social Dashboard
      • Identify high conflict articles
      • Surface editing patterns to readers
      • Route attention to articles that need it most
Future work
Thank you!

CHI2007 talk on Conflicts in Wikipedia


Editor's Notes

  • #2: Thank you. Today I’m going to be talking about conflict and coordination in Wikipedia. This is joint work with... Most everyone knows that Wikipedia is an online encyclopedia that anyone can edit. But as I was putting this talk together I thought to myself “how can I describe what makes Wikipedia so special?” And luckily I found this video clip of Steve Carell from the TV show The Office describing it in a much more... interesting way than I possibly could.