SharePoint and Large Scale SQL Deployments - NZSPC

Considerations for large-scale SharePoint deployments on SQL Server
Name: Joel Oleson
Title: Sr. Tech Prod Mgr
Company: Quest Software

Audience Poll
  • New to SharePoint?
  • SQL admins?
  • Large-scale implementation (>1TB) experience?
  • Scalability or performance issues in SharePoint deployments?

Session Overview
  • Lightweight
      • Understanding SharePoint databases
      • SQL performance
      • SQL Server 2008 with SharePoint
  • Heavyweight
      • Architectural design considerations
      • Real-world scenarios
      • Business requirements
      • Logical and physical architecture
      • Architectural design statistical results
  • Appendix: DB sizes, content distribution…

Lightweight

Real World Examples
Information based on real-world, large-scale SharePoint implementations.
  • Large software company (Microsoft)
      • Intranet portal for 120K users
      • Global enterprise collaboration solution (~20TB)
      • Scalable hosting solution (SharePoint Online)
  • Large automotive manufacturer
      • Loan origination application / document repository
      • ~50 million content items (~6 TB)
Understanding the SharePoint Databases

Disk I/O Demand
Columns: Most Demand, Medium Demand, Low Demand
Databases listed: *Content, Search, Config, Temp, Model, +SSP, Master, Tlogs
* Except during backup and indexing
+ Except during profile import

Top Performance Killers
  • Indexing/crawling
  • Backup (SQL & tape)
  • Profile import
  • Misc timer jobs: user sync for large numbers of users
  • STSADM backup/restore
  • Large list operations
  • Heavy user operation list import/write
Content Db
Config

SQL Server 2008 with Windows Server 2008
  • Transactional performance with SQL Server 2008 dramatically outperformed SQL 2005 on Win 2003.
  • Compressed backup in the box
  • Support for SQL External Blob Storage
  • Increased resiliency
  • Transparent encryption
  • See performance gains at http://msdn.microsoft.com/en-us/library/dd263442.aspx

Heavyweight

Architectural Design Considerations: Database Volumes
  • Separate database volumes into unique LUNs consisting of unique physical disk spindles.
  • In a heavily read-oriented internet (portal) site, prioritize data over logs.
  • Separate the Search database transaction log from the content database transaction logs.

Architectural Design Considerations: SQL TempDB Data Files
  • Optimal TempDB data file sizes can be calculated using the following formula:
    [MAX DB SIZE (KB)] x [0.25] / [# CORES] = DATA FILE SIZE (KB)
  • The result is the starting size of each data file; with one data file per core, TempDB as a whole starts at roughly 25% of the largest content or search DB.
  • Use RAID 10, on a LUN separate from other database objects (content, search, etc.).
  • Set the "Autogrow" feature to a fixed amount; if autogrow occurs, permanently increase the TempDB size.
  • Place the TempDB log file on its own LUN.
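
The TempDB formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and the sample inputs (a 100 GB largest database, an 8-core SQL Server) are assumptions, not figures from the deck.

```python
def tempdb_data_file_size_kb(max_db_size_kb: float, cores: int) -> float:
    """Starting size of each TempDB data file, per the slide's formula:
    [MAX DB SIZE (KB)] x 0.25 / [# CORES]."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return max_db_size_kb * 0.25 / cores


# Hypothetical inputs: largest content or search DB is 100 GB, 8 cores.
KB_PER_GB = 1024 * 1024
largest_db_kb = 100 * KB_PER_GB
per_file_kb = tempdb_data_file_size_kb(largest_db_kb, cores=8)

print(f"Per data file: {per_file_kb / KB_PER_GB:.3f} GB")
print(f"All 8 files:   {per_file_kb * 8 / KB_PER_GB:.1f} GB")
```

With these assumed inputs, each of the eight files starts at a little over 3 GB and the set adds up to a quarter of the largest database.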

Architectural Design Considerations: Content Databases
  • 100 content databases per Web application
  • 100GB per content database
  • CAUTION: DB locking issues have been reported in collaborative document management (DM) scenarios above 100GB.
      • Make sure you understand the issues based on number of users, usage profiles, etc.
  • Service Level Agreement (SLA) requirements for backup and restore will also have an impact on this decision.
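
As a rough planning aid, the two limits above can be turned into a database count. The sketch below is an assumption-laden illustration: the helper names are made up, and the 6 TB input simply reuses the approximate repository size quoted in the Real World Examples slide, not the layout the project actually used.

```python
import math

MAX_DBS_PER_WEB_APP = 100  # guidance from the slide
TARGET_DB_SIZE_GB = 100    # guidance from the slide


def content_dbs_needed(total_content_gb: float,
                       db_size_gb: float = TARGET_DB_SIZE_GB) -> int:
    """Number of content databases needed to hold the content at the target size."""
    return math.ceil(total_content_gb / db_size_gb)


def web_apps_needed(db_count: int,
                    max_dbs: int = MAX_DBS_PER_WEB_APP) -> int:
    """Number of Web applications needed to stay within the per-app DB limit."""
    return math.ceil(db_count / max_dbs)


# Hypothetical corpus: about 6 TB of stored content.
dbs = content_dbs_needed(6 * 1024)
apps = web_apps_needed(dbs)
print(f"{dbs} content databases across {apps} Web application(s)")
```

Backup and restore SLAs may push the target size per database lower than 100GB, which raises the count accordingly.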

Architectural Design Considerations: Content Databases - Continued
  • Pre-construct and pre-size
  • Use RAID 5 or RAID 10 logical units
      • RAID 10 is the best choice when cost is not a concern. RAID 5 will be sufficient and will save on costs, since content databases tend to be more read intensive than write intensive.
  • Multi-core computer running SQL Server
      • The primary filegroup could consist of a data file for each CPU core present in SQL Server.

Architectural Design Considerations: Database Maintenance
  • SQL Server SP2 is needed if using the DB maintenance wizard (KB930887).
  • Plan regular defrag of databases
  • Performance: Average Disk Queue Length
      • Single-digit values are optimal.
      • Occasional double-digit values aren't a large concern.
      • Sustained triple-digit values require attention.
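
The Average Disk Queue Length bands above can be expressed as a small check over sampled counter values. This is a sketch under stated assumptions: the band names and the definition of "sustained" as three consecutive triple-digit samples are choices made for the example, since the slide does not define them.

```python
from typing import Iterable


def queue_band(avg_disk_queue_length: float) -> str:
    """Map one Average Disk Queue Length sample to the slide's three bands."""
    if avg_disk_queue_length < 10:
        return "optimal"   # single digit
    if avg_disk_queue_length < 100:
        return "watch"     # double digit: fine if only occasional
    return "high"          # triple digit


def needs_attention(samples: Iterable[float], sustained_run: int = 3) -> bool:
    """True if `sustained_run` consecutive samples fall in the 'high' band.

    The run length is an assumed stand-in for 'sustained'.
    """
    run = 0
    for value in samples:
        run = run + 1 if queue_band(value) == "high" else 0
        if run >= sustained_run:
            return True
    return False


# Hypothetical samples: isolated spikes versus a sustained plateau.
print(needs_attention([3, 120, 5, 130, 4]))
print(needs_attention([3, 120, 150, 130, 4]))
```

The first call sees only isolated spikes; the second sees three triple-digit samples in a row and flags the volume.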

Architectural Design Considerations: Performance
  • The recommended practice is to place the transaction log files for each database volume type on unique LUNs:
      • Content database log files
      • Search database log files
  • Consider filegroups for the search database

Architectural Design Considerations: Topology
  • A single list should not have more than 2,000 items per list container.
      • A container represents the root of the list, as well as any folders within the list; a folder is a container because other list items are stored within it.
      • Whitepaper: Working with large lists in Office SharePoint Server 2007 (Steve Peschka), http://go.microsoft.com/fwlink/?LinkId=95450
  • Disk drive speed
      • 15K RPM recommended.
  • IIS application pools
      • Ensure the "Max Used Memory" setting utilizes all the available RAM in your WFEs.
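
To see what the 2,000-items-per-container guidance implies for a large library, the sketch below estimates how many levels of folders are needed. It assumes that folders count as items in their parent container and that folders are filled evenly; both are simplifications for illustration, and the function name is made up.

```python
import math

MAX_ITEMS_PER_CONTAINER = 2000  # guidance from the slide


def folder_levels_needed(item_count: int,
                         limit: int = MAX_ITEMS_PER_CONTAINER) -> int:
    """Levels of nested folders needed so that no container (list root or
    folder) holds more than `limit` items or subfolders."""
    levels = 0
    containers = item_count
    while containers > limit:
        # Group the current level into folders of at most `limit` entries.
        containers = math.ceil(containers / limit)
        levels += 1
    return levels


# 10.5 million documents, the Phase I volume from the automotive scenario.
print(folder_levels_needed(10_500_000))
```

For 10.5 million documents the first pass yields 5,250 leaf folders, which is itself more than one container should hold, so a second folder level is needed above them.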

Architectural Design Considerations: STSADM
  • STSADM command-line tool and the CreateSiteInNewDB operation
  • Gary Lapointe STSADM Extensions for site collection DB maintenance
  • Codeplex.com/governance tools for archive & delete capture
Large Scale Manufacture

Real-world Scenarios: Automotive Mfgr. Business Requirements (Phase I)
  • Loan Origination Application built on Office SharePoint Server 2007
  • Ability to manage 10.5 million images.
  • System performance with a "normal" input load defined as receipt of 27,000 images per business day (10 hours).
  • Simulate user load to represent 200 users for search, view & update with 2x peak

Real-world Scenarios: Data Load Process (Phase I)
  • Used KnowledgeLake Document Release Engine
      • Loaded 9.17 documents/second per server
      • Employs a high-volume, storage-based folder architecture within SharePoint to ensure UI responsiveness.
      • Executed on 4 servers.
  • Using this application, we were able to achieve:
      • An average document load throughput of 36.6 documents per second!
      • An average daily input of 3.17 million documents!
      • 10.5 million documents with only 28% utilization!
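
The throughput figures above follow from the per-server rate and the server count. The short sketch below reproduces that arithmetic, assuming the load ran around the clock (86,400 seconds per day); the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

per_server_docs_per_second = 9.17  # from the slide
servers = 4                        # from the slide

# Aggregate rate across all load servers, then scaled to a full day.
aggregate_docs_per_second = per_server_docs_per_second * servers
docs_per_day = aggregate_docs_per_second * SECONDS_PER_DAY

print(f"{aggregate_docs_per_second:.2f} documents/second")
print(f"{docs_per_day / 1_000_000:.2f} million documents/day")
```

The results line up with the roughly 36.6 documents per second and 3.17 million documents per day quoted on the slide.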

Real-world Scenarios: Data Load Process (Phase II)
  • 15 million documents consisting of Word (.docx), Excel (.xlsx), PowerPoint (.pptx) and Adobe PDF.
  • Five Web Front-Ends were used for the load process.
  • Peak load rate: 24.3 docs per second / 2.1 million documents per day.
  • Average load rate: ~1.9 million documents per day.
  • Load time: 8 days.
  • NOTE: Load rates included the automation process that created the PDF files.
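
The Phase II load time can be cross-checked from the document count and the average daily rate. This is a minimal sketch of that estimate; the function name is an assumption, and the inputs are the rounded figures from the slide.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def load_days(total_docs: int, docs_per_day: float) -> float:
    """Estimated days to load `total_docs` at a steady daily rate."""
    if docs_per_day <= 0:
        raise ValueError("docs_per_day must be positive")
    return total_docs / docs_per_day


# Peak rate from the slide, converted from per-second to per-day.
peak_docs_per_day = 24.3 * SECONDS_PER_DAY

# Load time at the average rate quoted on the slide.
estimated_days = load_days(15_000_000, 1_900_000)

print(f"Peak: {peak_docs_per_day / 1_000_000:.1f} million documents/day")
print(f"Estimated load time: {estimated_days:.1f} days")
```

The estimate comes out just under eight days, consistent with the load time reported above.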
What does the logical architecture look like?!
What does the physical architecture look like?!
  • Scale OUT…
  • Scale UP…

What does the site topology look like?!
  • Phase I: 17 Divisional Site Collections / DBs
  • Phase II: 10 Departmental Site Collections / DBs
What does the storage architecture look like?
Database Sizes: Phase I

Architectural Design Statistical Results: Phase I
  • Designed once / built once
      • No architecture OR configuration changes were required after the initial build was completed.
  • 10.5+ million documents loaded into the system in approximately 60 hours!
  • Full crawl indexed 10 million items in 32 hours!
  • Average content database size for divisional breakouts was 60GB

Architectural Design Statistical Results: Phase II
  • Search database size was 539GB.
      • Lesson learned: the large search database caused disk I/O contention; break it out into multiple data files matching the number of processor cores on SQL Server, and spread them over unique LUNs.
  • Total index size was 162GB!
  • Average content database size for divisional breakouts was 200.65GB!
  • Average content database size for departmental breakouts was 137.60GB!
Large Scale Pharma

Real-world Scenarios: Pharmaceutical Business Requirements
  • Collaboration portal built on Office SharePoint Server 2007
  • Validate ~40TB of content storage.
  • Identify performance characteristics and provide guidance around content database sizing
  • FAST search integration

Real-world Scenarios: Data Load Process
  • 71,524,357 documents loaded across two SharePoint farms in 10.92 days!
  • Content was spread across the farms into 165 unique content databases.
  • 6,240 site collections, each containing 10 sub-sites, for a total of 62,400 sites.
  • Database sizes were pre-configured to vary from 100GB to 350GB to determine performance and/or SLA impacts.
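
The averages implied by the figures above can be derived with a small helper. The sketch below is illustrative only: the `LoadRun` class and its property names are made up for this example, and the averages assume the load was spread evenly over time and across databases, which the deck does not state.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60


@dataclass(frozen=True)
class LoadRun:
    """Headline numbers for one bulk-load run (hypothetical container)."""
    documents: int
    days: float
    content_dbs: int
    site_collections: int
    subsites_per_collection: int

    @property
    def total_sites(self) -> int:
        return self.site_collections * self.subsites_per_collection

    @property
    def docs_per_day(self) -> float:
        return self.documents / self.days

    @property
    def docs_per_second(self) -> float:
        return self.docs_per_day / SECONDS_PER_DAY

    @property
    def docs_per_content_db(self) -> float:
        return self.documents / self.content_dbs


# Figures from the slide.
pharma = LoadRun(documents=71_524_357, days=10.92, content_dbs=165,
                 site_collections=6_240, subsites_per_collection=10)

print(f"{pharma.total_sites:,} sites")
print(f"{pharma.docs_per_day:,.0f} documents/day")
print(f"{pharma.docs_per_second:.1f} documents/second")
print(f"{pharma.docs_per_content_db:,.0f} documents per content DB on average")
```

Averaged over the run this is about 6.5 million documents per day, or somewhat more than 75 documents per second across the two farms.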
What does the logical architecture look like?!
What does the physical architecture look like?!
What does the site topology look like?!
  • 165 content DBs
  • 6,240 site collections
  • 10 sub-sites in each collection: 62,400 sites!
What does the storage architecture look like?

Architectural Design Statistical Results: Conclusion
  • User loads
      • Stress tests included 2,000 to 3,000 concurrent users.
      • Based on the 10% rule, testing completed equated to an environment representing 300,000 users!
      • RAW number of RPS during peak times is 1,469 at Pharma.773 RPS, which equates to 346.59 ACTUAL RPS!
  • FAST search integration
      • Successfully integrated FAST search capabilities, indexed the content corpus and served search results as expected.

Large-Scale Case Study Available
SharePoint Scalability and Performance Whitepaper
  • Contains the majority of the content you will see here, along with test results you won't see here.
  • TechNet topic: http://go.microsoft.com/fwlink/?LinkId=120901
  • Word 2007 format: http://go.microsoft.com/fwlink/?LinkId=120881
  • Word 2000-2003 format: http://go.microsoft.com/fwlink/?LinkId=120890
  • PDF format: http://go.microsoft.com/fwlink/?LinkId=120891

Questions & Answers
Appendix
Database Sizes: MPSC/Nissan Phase I
Database Sizes: MPSC/Nissan Phase II

Performance of Components Over Time: MPSC/Nissan Phase I
  • 14 individual performance tests were run to simulate various load scenarios.

How do we pull all this together?! Pharma: Content Database Distribution
  • Substitute "F1" with the SQL Server number to generate unique DBs
  • Farm 1: 2 SQL
  • Farm 2: 1 SQL
  • 165 content databases!

How do we pull all this together?! Pharma: Data Load Statistics

Architectural Design Statistical Results
  • Testing Results: 300GB Content Databases
  • Testing Results: 350GB Content Databases
  • Testing Results: 250GB Content Databases
  • Testing Results: 150GB Content Databases

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.


Editor's Notes

  • #19: Pre-construct and pre-size your content databases. Once the content database size has been specified, it is recommended that the database be created using a script that appropriately generates the empty database. Note that the "Autogrow" feature should be left on to prevent any future issues.
    Place the content database file or files on RAID 5 or RAID 10 logical units. RAID 10 is the best choice when cost is not a concern. RAID 5 will be sufficient and will save on costs, since content databases tend to be more read intensive than write intensive.
    For a large-scale document management solution, with a multi-core computer running SQL Server, the primary filegroup for the content database could potentially consist of a data file for each CPU core present in SQL Server. If possible, move each data file to separate logical units consisting of unique physical disk spindles.
    Database storage for content items will be between 1.2 and 1.5 times the raw file size when stored in SharePoint.
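
The storage overhead mentioned in note #19 (1.2 to 1.5 times the raw file size) gives a simple range for pre-sizing content databases. The sketch below applies it to a hypothetical 1,000 GB of raw files; the function name and the input are assumptions for illustration.

```python
from typing import Tuple


def stored_size_range_gb(raw_gb: float,
                         low_factor: float = 1.2,
                         high_factor: float = 1.5) -> Tuple[float, float]:
    """Expected database storage range for `raw_gb` of raw files,
    using the 1.2x to 1.5x overhead quoted in note #19."""
    if raw_gb < 0:
        raise ValueError("raw_gb must not be negative")
    return raw_gb * low_factor, raw_gb * high_factor


# Hypothetical input: 1,000 GB of raw documents.
low_gb, high_gb = stored_size_range_gb(1000.0)
print(f"Plan for roughly {low_gb:.0f} to {high_gb:.0f} GB of database storage")
```

Pre-sizing toward the upper end of the range leaves less work for autogrow during the load.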