MySQL Performance Tuning



Philip Zhong



© 2011 Cisco and/or its affiliates. All rights reserved.   Cisco Confidential   1
• Types of I/O schedulers (introduced in Linux kernel 2.6.10; RHEL5)

        – noop: FIFO queue that only merges adjacent I/O requests;
       no reordering, that’s all
        – deadline: Prioritizes read (sync) requests over write
       requests (async) to some extent (to avoid the “write-starving-reads”
       problem)
        – cfq (default): Schedules I/O requests fairly across I/O threads
        – anticipatory: Removed in 2.6.33 (bad scheduler; don’t use it)
• Default is cfq, but noop or deadline is better in many cases

         # echo noop > /sys/block/sdX/queue/scheduler
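
Before changing the scheduler it helps to see which one is active. The kernel lists the available schedulers in the same sysfs file and puts the active one in square brackets. A minimal sketch of reading that line follows; the function names are illustrative and the device name is whatever replaces sdX on your system.

```python
from pathlib import Path


def parse_scheduler(text):
    """Split a sysfs scheduler line into (active, available)."""
    available = []
    active = None
    for token in text.split():
        # The kernel marks the active scheduler with square brackets.
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return active, available


def read_scheduler(device):
    """Read the scheduler line for a block device name such as 'sda'."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_scheduler(path.read_text())
```

For example, `parse_scheduler("noop anticipatory deadline [cfq]")` reports cfq as active out of four available schedulers.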


• Make sure to use a Flash-Backed Write Cache (FBWC) or
        Battery-Backed Write Cache (BBWC) on RAID cards
            – 10,000+ fsync() calls per second with BBWC; fewer than 200 without it on HDD
            – Disable the write cache on the disks themselves for safety reasons
• Do not enable “write barriers” on file systems when a BBWC/FBWC is present (barriers are enabled by default in some cases)
            – Barriers force write-through to disks even though BBWC is enabled (very slow)
            – ext3: mount -o barrier=0
            – xfs: mount -o nobarrier
            – drbd: no-disk-barrier in drbd.conf
• Consider disabling atime updates on files and directories
            – mount -o noatime,nodiratime
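
The fsync() rates quoted above are easy to check on your own hardware. The following is a rough sketch, not a benchmark tool: the function name, the 512-byte write size, and the default count are assumptions chosen for illustration.

```python
import os
import tempfile
import time


def measure_fsync_rate(directory, count=200):
    """Return fsync() calls per second for small appends in `directory`."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, b"x" * 512)  # small append, similar to a log write
            os.fsync(fd)              # force the data to stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return count / elapsed if elapsed > 0 else float("inf")
```

Run it against a directory on the volume that holds the MySQL logs; the result depends on the controller cache, barrier settings, and file system in use.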

• Put sequentially written files on HDD
          – ibdata, ib_logfile, binary log files
          – HDD is fast enough for sequential writes
          – Write performance deterioration on SSD can be mitigated
          – Life expectancy of SSD will be longer
• Put randomly accessed files on SSD
          – *.ibd files, index files (MYI), data files (MYD)
          – SSD is 10x-100x faster than HDD for random reads
          – Archive less active tables/records to HDD
          – SSD is still much more expensive than HDD
• Use the MySQL 5.1 InnoDB Plugin
          – Higher scalability and concurrency matter more on faster storage
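
The placement rule above can be written down as a small lookup. This is only a sketch of the slide's rule of thumb: `suggest_tier` is a hypothetical helper, and the binary log pattern assumes the base name mysql-bin, which depends on how log-bin is configured.

```python
import fnmatch

# Patterns taken from the slide. "mysql-bin.*" is an assumed binary log
# base name; the real name depends on the log-bin setting.
SEQUENTIAL_PATTERNS = ["ibdata*", "ib_logfile*", "mysql-bin.*"]
RANDOM_PATTERNS = ["*.ibd", "*.MYI", "*.MYD"]


def suggest_tier(filename):
    """Return 'hdd', 'ssd', or None for a MySQL data-directory file name."""
    if any(fnmatch.fnmatchcase(filename, p) for p in SEQUENTIAL_PATTERNS):
        return "hdd"  # sequentially written files
    if any(fnmatch.fnmatchcase(filename, p) for p in RANDOM_PATTERNS):
        return "ssd"  # randomly accessed files
    return None       # the slide gives no guidance for other files
```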
• RAM access is much faster than HDD/SSD access

       RAM: ~60 ns
       HDD: ~5 ms
       SSD: 100-500 µs
• 16-100+ GB of RAM is now common

• Sharding hot application data is important

• Do not create unnecessary indexes

• Delete records or move them to archive tables to keep hot tables
       small
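
To put the latency figures above in proportion, the arithmetic below converts them to a common unit. It is a back-of-the-envelope sketch that uses the slide's approximate numbers and assumes strictly serial accesses; the names are illustrative.

```python
# Approximate access latencies from the slide, in nanoseconds.
RAM_NS = 60
HDD_NS = 5_000_000                 # ~5 ms
SSD_NS_RANGE = (100_000, 500_000)  # 100-500 µs


def slowdown_vs_ram(latency_ns):
    """How many RAM accesses fit into one access of the given latency."""
    return latency_ns / RAM_NS


def serial_accesses_per_second(latency_ns):
    """Upper bound on back-to-back accesses per second at this latency."""
    return 1_000_000_000 / latency_ns
```

With these numbers one HDD access costs about as much as 83,000 RAM accesses, and a single HDD can serve roughly 200 serial random accesses per second against 2,000 to 10,000 for an SSD, which is why keeping the hot data set in RAM matters.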


• Network

• CPU

• ……




• EXPLAIN SELECT

         – Tells you what MySQL is thinking
         – Which keys (indexes) it can use
         – Which keys it will use
         – How many rows it must examine (roughly)
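
The three questions above map onto the possible_keys, key, and rows columns of EXPLAIN output. A minimal sketch of reading one row follows; `summarize_explain_row` is a hypothetical helper, and the sample row is made up for illustration rather than taken from a real server.

```python
def summarize_explain_row(row):
    """Turn one EXPLAIN output row (as a dict) into readable findings."""
    findings = []
    candidates = row.get("possible_keys")  # indexes MySQL could use
    chosen = row.get("key")                # index MySQL decided to use
    findings.append(f"can use: {candidates or 'no index'}")
    findings.append(f"will use: {chosen or 'no index'}")
    findings.append(f"rows examined (estimate): {row.get('rows')}")
    if row.get("type") == "ALL":
        # type = ALL means every row of the table is read.
        findings.append("full table scan")
    return findings


# Illustrative row, not real output from any server.
sample = {
    "id": 1, "select_type": "SIMPLE", "table": "Tags", "type": "ALL",
    "possible_keys": None, "key": None, "rows": 1000, "Extra": "Using where",
}
```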




• EXPLAIN doesn’t tell you anything about how triggers, stored
       functions, or UDFs will affect your query.
• It doesn’t work for stored procedures, although you can extract
       the queries manually and EXPLAIN them individually.
• Some of the statistics it shows are estimates and can be very
       inaccurate.
• It doesn’t show you everything there is to know about a query’s
       execution plan.
• It doesn’t distinguish between some things with the same name.
       For example, it uses “filesort” for in-memory sorts and for
       temporary files, and it displays “Using temporary” for temporary
       tables on disk and in memory, which can be misleading.

• EXPLAIN select_type values

         – SIMPLE: Simple SELECT (not using UNION or subqueries)
         – PRIMARY: Outermost SELECT
         – UNION: Second or later SELECT statement in a UNION
         – DEPENDENT UNION: Second or later SELECT statement in a UNION, dependent on the outer query
         – UNION RESULT: Result of a UNION
         – SUBQUERY: First SELECT in a subquery
         – DEPENDENT SUBQUERY: First SELECT in a subquery, dependent on the outer query
         – DERIVED: Derived table SELECT (subquery in the FROM clause)
         – UNCACHEABLE SUBQUERY: A subquery whose result cannot be cached and must be reevaluated for each row of the outer query
         – UNCACHEABLE UNION: The second or later SELECT in a UNION that belongs to an uncacheable subquery (see UNCACHEABLE SUBQUERY)


• mysql> SET profiling = 1;

• mysql> SHOW PROFILES;

• mysql> SHOW PROFILE CPU FOR QUERY 1;
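
The same three statements can be driven from application code. The sketch below assumes `cursor` is a Python DB-API cursor connected to MySQL that returns rows as tuples; no particular driver is implied, and `profile_query` is a hypothetical helper. It also assumes `sql` is a SELECT, so that draining the result set is valid.

```python
def profile_query(cursor, sql):
    """Run `sql` with profiling on and return (duration, stages)."""
    cursor.execute("SET profiling = 1")
    cursor.execute(sql)
    cursor.fetchall()  # drain the result set before the next statement

    cursor.execute("SHOW PROFILES")
    profiles = cursor.fetchall()          # rows: (Query_ID, Duration, Query)
    query_id, duration, _ = profiles[-1]  # most recent profiled statement

    cursor.execute(f"SHOW PROFILE CPU FOR QUERY {int(query_id)}")
    stages = cursor.fetchall()  # rows: (Status, Duration, CPU_user, CPU_system)
    return duration, stages
```

Sorting `stages` by the Duration column is usually the quickest way to see which execution stage dominates. Note that newer MySQL releases deprecate SHOW PROFILE in favor of the Performance Schema.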




SELECT tag_id, tag_text FROM Tags WHERE tag_id + 1 > 1 AND tag_id + 1 < 11;
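
The slide carries no commentary, but the query wraps the indexed column in arithmetic, a pattern that generally prevents the optimizer from using an index range scan on tag_id. The sketch below illustrates the effect with SQLite from Python's standard library as a stand-in, so the plan text is SQLite's and not MySQL's EXPLAIN output. The table definition and the rewritten predicate are assumptions made for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tags (tag_id INTEGER PRIMARY KEY, tag_text TEXT)")
conn.executemany(
    "INSERT INTO Tags VALUES (?, ?)",
    [(i, f"tag{i}") for i in range(1, 101)],
)

# The query from the slide: the column is hidden inside an expression.
ORIGINAL = (
    "SELECT tag_id, tag_text FROM Tags "
    "WHERE tag_id + 1 > 1 AND tag_id + 1 < 11"
)
# Equivalent predicate for integer keys with the column left bare.
REWRITTEN = (
    "SELECT tag_id, tag_text FROM Tags "
    "WHERE tag_id > 0 AND tag_id < 10"
)


def plan(sql):
    """Return the plan descriptions SQLite reports for `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]  # the detail text is the fourth column


same_rows = conn.execute(ORIGINAL).fetchall() == conn.execute(REWRITTEN).fetchall()
```

Here `plan(ORIGINAL)` reports a full scan of Tags while `plan(REWRITTEN)` reports a search on the primary key, and both queries return the same rows. On MySQL the equivalent check is to run EXPLAIN on each form and compare the type, key, and rows columns.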




Thank you.





Editor's Notes

  • #5:
    1) Anticipatory: The anticipatory I/O elevator was created based on the assumption of a block device with only one physical seek head (for example, a single SATA drive). The anticipatory elevator uses the deadline mechanism described in more detail below plus an anticipation heuristic. As the name suggests, the anticipatory I/O elevator “anticipates” I/O and attempts to write it in single, bigger streams to the disk instead of multiple very small random disk accesses. The anticipation heuristic may cause latency for write I/O. It is clearly tuned for high throughput on general purpose systems such as the average personal computer. Up to kernel release 2.6.18 the anticipatory elevator is the standard I/O scheduler. However, most Enterprise Linux distributions default to the CFQ elevator.
    2) Complete Fair Queuing (CFQ): The CFQ elevator implements a QoS (Quality of Service) policy for processes by maintaining per-process I/O queues. The CFQ elevator is well suited for large multiuser systems with a lot of competing processes. It aggressively attempts to avoid starvation of processes and features low latency. Starting with kernel release 2.6.18 the improved CFQ elevator is the default I/O scheduler. Depending on the system setup and the workload characteristics, the CFQ scheduler can slow down a single main application, for example a massive database, with its fairness-oriented algorithms. The default configuration handles the fairness based on process groups which compete against each other. For example, a single database and all writes through the page cache (all pdflush instances are in one pgroup) are considered as a single application by CFQ that could compete against many background processes. It can be useful to experiment with I/O scheduler subconfigurations and/or the deadline scheduler in such cases.
    3) Deadline: The deadline elevator is a cyclic elevator (round robin) with a deadline algorithm that provides a near real-time behavior of the I/O subsystem. The deadline elevator offers excellent request latency while maintaining good disk throughput. The implementation of the deadline algorithm ensures that starvation of a process cannot occur.
    4) NOOP: NOOP stands for No Operation, and the name explains most of its functionality. The NOOP elevator is simple and lean. It is a simple FIFO queue that does not perform any data ordering. NOOP simply merges adjacent data requests, so it adds very low processor overhead to disk I/O. The NOOP elevator assumes that a block device either features its own elevator algorithm, such as TCQ for SCSI, or that the block device has no seek latency, such as a flash card.
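
The NOOP behavior described in this note, a FIFO queue that merges adjacent requests and never reorders them, can be pictured with a toy model. This is an illustration only and not the kernel's algorithm: requests are (start sector, length) tuples, and only a merge with the most recently queued request is modeled.

```python
def noop_enqueue(queue, start, length):
    """Append a request (start sector, length) to a FIFO queue,
    merging it into the last request when the two are contiguous."""
    if queue:
        last_start, last_length = queue[-1]
        if last_start + last_length == start:
            # The new request begins where the last one ends: merge them.
            queue[-1] = (last_start, last_length + length)
            return
    # Otherwise keep arrival order; requests are never sorted.
    queue.append((start, length))
```

Enqueuing sectors 0, 8, 100, and 50 (eight sectors each) leaves three requests in arrival order: the first two merged, then 100, then 50.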
  • #7: A write barrier is a kernel mechanism used to ensure that file system metadata is correctly written and ordered on persistent storage, even when storage devices with volatile write caches lose power. The disk contains all the data. No re-ordering has occurred. With barriers enabled, an fsync() call will also issue a storage cache flush.
    When a write hits in a cache, two possible policy choices exist. First, the data can be written both into the cache and passed on to the next lower level in the memory hierarchy. This policy is called write-through. A second possible policy on write hits is to only write the data to the first-level cache. Only when a dirty line (i.e., a line that has been written to) is replaced in the cache is the data transferred to a lower level in the memory hierarchy. This policy is called write-back.
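
The two write policies in this note can be contrasted with a toy cache. It is a minimal sketch under simplifying assumptions: the backing store is a plain dict, hits and misses are not distinguished, and `ToyCache` is a made-up name.

```python
class ToyCache:
    """Single-level cache in front of a backing store (a plain dict)."""

    def __init__(self, backing, write_back):
        self.backing = backing
        self.write_back = write_back
        self.lines = {}     # cached key -> value
        self.dirty = set()  # keys written to the cache but not yet to backing

    def write(self, key, value):
        self.lines[key] = value
        if self.write_back:
            self.dirty.add(key)        # write-back: defer the lower-level write
        else:
            self.backing[key] = value  # write-through: update both now

    def evict(self, key):
        """Drop a line; a dirty line is written to the backing store first."""
        if key in self.dirty:
            self.backing[key] = self.lines[key]
            self.dirty.discard(key)
        self.lines.pop(key, None)
```

With write_back=False every write reaches the backing store immediately. With write_back=True the backing store stays stale until the line is evicted, which is why a volatile write-back cache can lose data on power failure unless it is battery- or flash-backed.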