SlideShare a Scribd company logo
High Performance MySQL 高性能 MySQL 苏业栋
High Performance MySQL Finding Bottlenecks: Benchmarking and Profiling Schema Optimization and Indexing MySQL Architecture
MySQL Architecture
Connection Management and Security 1. One thread per connection 2. Server Caches threads 3. Authentication based on user, pass, SSL 4. Verify on every query
Optimization and Execution 1.Rewrite query 2.Determine which table to read first 3.Choose which index to use 4.Statistic the storage engines before make optimization
Concurrency Control Read/Write Locks Lock Granularity Table locks  MyISAM Row locks  INNODB
Transaction Atomicity Consistency Isolation  Durability
Isolation Levels READ UNCOMMITTED READ COMMITTED REPEATABLE READ SERIALIZABLE
ISOLATION LEVEL
Deadlocks Deadlock  is when two or more transactions are mutually holding and requesting locks on the same resources, creating a cycle of dependencies .  Deadlocks occur when transactions try to lock resources in a different order.
Transaction Logging Transaction logging helps make transactions more efficient. Updating the log first and at some later time, update disk. Sequence I/O in a small area .vs. Random I/O in many places If there’s a crash after the update is written to the transaction log but before the changes are made to the data itself, the storage engine can still recover the changes upon restart.
Transactions in MySQL AUTOCOMMIT If DDL execute, then commit But No effect on MyISAM or Memory tables
Mixing storage engines in transactions Store engines  manage the transactions not server. If mix transactional and nontransactional tables, rollback will be partially
Implicit and explicit locking Implicit locking Acquire lock during transaction and not release until COMMIT or ROLLBACK Explicit locking SELECT … LOCK IN SHARE MODE SELECT … FOR UPDATE
MySQL’s Storage Engines MyISAM InnoDB Memory Archive CSV Federated Blackhole NDB Falcon solidDB PBXT(Primebase XT) Maria Storage
MyISAM Engine Locking and concurrency Automatic repair Manual repair Index feature Delayed key writes
InnoDB Engine Transaction Tablespace MVCC All 4 isolation level Clustered index
Selecting the Right Engine Important
Consideration Transactions  InnoDB: most stable,well-integrated,proven choice when write MyISAM: no transactions, primarily either SELECT or INSERT Concurrency Just insert and read cuncurrently, MyISAM mixture operations without interfering with each other,row-level locking Backups shut down for backups or online backups
Consideration Crash recovery If you have a lot of data, you should seriously consider how long it will take to recover from a crash. MyISAM tables generally become corrupt more easily and take much longer to recover than InnoDB tables, for example. In fact, this is one of the most important reasons why a lot of people use InnoDB when they don’t need transactions. Special feature Full text index?  MYISAM Clustered index optimization?  InnoDB or solidDB
Practical Examples logging Read-only or read-mostly tables MyISAM is faster than InnoDB but not always Order processing transactions required foreign key? InnoDB Stock quotes MyISAM
Storage Engine Summary
Table Conversions ALTER TABLE Dump and import Create and select
Finding Bottlenecks: Benchmarking and Profiling
Finding Bottlenecks: Benchmarking and Profiling At some point, you’re bound to need more performance from MySQL. But what should you try to improve?  A particular query?  Your schema?  Your hardware?  The only way to know is to measure what your system is doing, and test its performance under various conditions.
We want to find What prevents better performance ? What will prevent better performance in the future.
Benchmarking and Profiling Benchmarking answers the question “How well does this perform?”  Profiling answers the question  “Why does it perform the way it does?”
Benchmark can help you Measure how your application currently performs.  Validate your system’s scalability. Plan for growth. Test your application’s ability to tolerate a changing environment.  Test different hardware, software, and operating system configurations.
Benchmarking Strategies The application as a whole Isolate MySQL ?
Application as a whole instead of just MySQL You don’t care about MySQL’s performance in particular; you care about the whole application. MySQL is not always the application bottleneck Only by testing the full application can you see how each part’s cache behaves. Benchmarks are good only to the extent that they reflect your actual application’s behavior, which is hard to do when you’re testing only part of it.
Just need a MySQL benchmark You want to compare different schemas or queries. You want to benchmark a specific problem you see in the application. You want to avoid a long benchmark in favor of a shorter one that gives you a faster “cycle time” for making and measuring changes.
What to Measure You need to identify your goals before you start benchmarking—indeed, before you even design your benchmarks.  Your goals will determine the tools and techniques you’ll use to get accurate, meaningful results.
Consideration Transactions per time unit Response time or latency Scalability Concurrency
Bad bemchmarks(1) Using a subset of the real data size Using incorrectly distributed data Using unrealistically distributed parameters, such as pretending that all user profiles are equally likely to be viewed. Using a single-user scenario for a multiuser application Benchmarking a distributed application on a single server
Bad bemchmarks(2) Failing to match real user behavior Running identical queries in a loop Failing to check for errors. Ignoring how the system performs when it’s not warmed up, such as right after a restart Using default server settings
Design and Planning a Benchmark Identify the problem and the goal Whether to use a standard benchmark or design your own Queries against the data
Benchmarking Tools Full-Stack Tools:  such as ab, http_load, JMeter Single-Component Tools :  such as mysqlslap, sysbench, Database Test Suite,  sql-bench
Profiling Profiling shows you how much each part of a system contributes to the total cost of producing a result The simplest cost metric is time, but profiling can also measure the number of function calls,  I/O operations, database queries. The goal is to understand why a system performs the way it does.
Take PHP web site for example Database access is often, but not always, the bottleneck in applications. May be… External resources, such as calls to web services or search engines Operations that require processing large amounts of data in the application, such as parsing big XML files Expensive operations in tight loops, such as abusing regular expressions Badly optimized algorithms, such as naive search algorithms to find items in lists
How and what to measure We recommend that you include profiling code in  every  new project you start. It might be hard to inject profiling code into an existing application, but it’s easy to include it in new applications. What should we do in the profiling code?
Your profiling code should gather and log at least the following Total execution time, or “wall-clock time” (in web applications, this is the total page render time) Each query executed, and its execution time Each connection opened to the MySQL server Every call to an external resource, such as web services,  memcached , and externally invoked scripts Potentially expensive function calls, such as XML parsing User and system CPU time If you act like the mentioned, you may lucky to find something you might not capture otherwise, like: Overall performance problems Sporadically increased response times System bottlenecks, which might not be MySQL Execution time of “invisible” users, such as search engine spiders
Will Profiling Slow Your Servers? <?php $profiling_enabled = rand(0, 100) > 99; ?> Profiling just 1% of your page views should help you find the worst problems.
MySQL Profiling The goal is to find out where MySQL spends most of its time. The kinds of information you can glean include: Which data MySQL accesses most What kinds of queries MySQL executes most What states MySQL threads spend the most time in What subsystems MySQL uses most to execute a query What kinds of data accesses MySQL does during a query How much of various kinds of activities, such as index scans, MySQL does
Logging queries The general log captures all queries, as well as some nonquery events such as connecting and disconnecting. You can enable it with a single configuration directive: log = <file_name> By design, the general log does not contain execution times or any other information that’s available only after a query finishes. Slow log capture all queries that take more than two seconds to execute, and log queries that don’t use any indexes. It will also log slow administrative statements, such as OPTIMIZE TABLE: log-slow-queries = <file_name> long_query_time = 2 log-queries-not-using-indexes log-slow-admin-statements
How to read the slow query log 1 # Time: 071031 20:03:16 2 # User@Host: root[root] @ localhost [] 3 # Thread_id: 4 4 # Query_time: 0.503016 Lock_time: 0.000048 Rows_sent: 56 Rows_examined: 1113 5 # QC_Hit: No Full_scan: No Full_join: No Tmp_table: Yes Disk_tmp_table: No 6 # Filesort: Yes Disk_filesort: No Merge_passes: 0 7 # InnoDB_IO_r_ops: 19 InnoDB_IO_r_bytes: 311296 InnoDB_IO_r_wait: 0.382176 8 # InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.067538 9 # InnoDB_pages_distinct: 20 10 SELECT ...
Try the slow query again? You may find a slow query, run it yourself, and find that it executes in a fraction of a second. Appearing in the log simply means the query took a long time then; it doesn’t mean it will take a long time now or in the future. There are many reasons why a query can be slow sometimes and fast at other times: A table may have been locked, causing the query to wait. The lock time indicates how long the query waited for locks to be released. The data or indexes may not have been cached in memory yet. This is common when MySQL is first started or hasn’t been well tuned. A nightly backup process may have been running, making all disk I/O slower. The server may have been running other queries at the same time, slowing down this query.
Log analysis tool mysqldumpslow mysql_slow_log_filter mysql_slow_log_parser mysqlsla
Profiling a MySQL Server SHOW STATUS  ; mysqladmin extended -r -i 10
Profiling a MySQL Server Bytes_received and Bytes_sent The traffic to and from the server Com_* The commands the server is executing Created_* Temporary tables and files created during query execution Handler_* Storage engine operations Select_* Various types of join execution plans Sort_* Several types of sort information
SHOW PROFILE mysql>  SET profiling = 1; Now let’s run a query: mysql>  SELECT COUNT(DISTINCT actor.first_name) AS cnt_name, COUNT(*) AS cnt ->  FROM sakila.film_actor ->  INNER JOIN sakila.actor USING(actor_id) ->  GROUP BY sakila.film_actor.film_id ->  ORDER BY cnt_name DESC; ... 997 rows in set (0.03 sec)
SHOW PROFILE mysql>  SHOW PROFILE; +------------------------+-----------+ | Status | Duration | +------------------------+-----------+ | (initialization) | 0.000005 | | Opening tables | 0.000033 | | System lock | 0.000037 | | Table lock | 0.000024 | | init | 0.000079 | | optimizing | 0.000024 | | statistics | 0.000079 | | preparing | 0.00003 | | Creating tmp table | 0.000124 | | executing | 0.000008 | | Copying to tmp table | 0.010048 | | Creating sort index | 0.004769 | | Copying to group table | 0.0084880 | | Sorting result | 0.001136 | | Sending data | 0.000925 | | end | 0.00001 | | removing tmp table | 0.00004 | | end | 0.000005 | | removing tmp table | 0.00001 | | end | 0.000011 | | query end | 0.00001 | | freeing items | 0.000025 | | removing tmp table | 0.00001 | | freeing items | 0.000016 | | closing tables | 0.000017 | | logging slow query | 0.000006 | +------------------------+-----------+
Schema Optimization and Indexing Choosing Optimal Data Types Indexing Basics Indexing Strategies for High Performance Index and Table Maintenance Normalization and Denormalization Speeding Up ALTER TABLE Notes on Storage Engines
Choosing Optimal Data Type Some guidelines Smaller is usually better Simple is good Avoid NULL if possible It’s harder for MySQL to optimize queries that refer to nullable columns,because they make indexes, index statistics, and value comparisons more complicated.
Whole Numbers MySQL lets you specify a “width” for integer types, such as INT(11). This is meaningless for most applications: it does not restrict the legal range of values, but simply specifies the number of characters MySQL’s interactive tools (such as the command-line client) will reserve for display purposes. For storage and computational purposes, INT(1) is identical to INT(20).
Real Numbers MySQL uses DOUBLE for its internal calculations on floating point types. Because of the additional space requirements and computational cost, you should use DECIMAL only when you need exact results for fractional numbers—for example, when storing financial data.
String Types VARCHAR and CHAR types BLOB and TEXT types Using ENUM instead of a string type
String Types
Date and Time Types DATETIME stores the date and time packed into an integer in YYYYMMDDHHMMSS format, independent of time zone, eight bytes of storage space TIMESTAMP type stores the number of seconds elapsed since midnight,depends on the time zone,only four bytes of storage
Bit-Packed Data types MySQL has a few storage types that use individual bits within a value to store data compactly. All of these types are technically string types, regardless of the underlying storage format and manipulations.
Choosing Identifiers When choosing a type for an identifier column, you need to consider not only the storage type, but also how MySQL performs computations and comparisons on that type.  Once you choose a type, make sure you use the same type in all related tables. The types should match exactly, including properties such as UNSIGNED.* Mixing different data types can cause performance problems, and even if it doesn’t, implicit type conversions during comparisons can create hard-to-find errors. These may even crop up much later, after you’ve forgotten that you’re comparing different data types.
Special Types of Data IP address is really an unsigned 32-bit integer, not a string. MySQL provides the INET_ATON( ) and INET_NTOA( ) functions to convert between the two representations.
Index Basics Indexes are implemented in the storage engine layer, not the server layer. Thus, they are not standardized: indexing works slightly differently in each engine, and not all engines support all types of indexes.
Index Basics Types of Indexes B-Tree indexes Hash indexes Spatial(R-Tree) indexes Full-text indexes
Index Basics
B-Tree Index CREATE TABLE People ( last_name  varchar(50)  not null, first_name varchar(50)  not null, dob  date  not null, gender  enum('m', 'f') not null, key(last_name, first_name, dob) );
B-Tree Index Types of Indexes B-Tree indexes Hash indexes Spatial(R-Tree) indexes Full-text indexes
Types of Indexes B-Tree indexes Hash indexes Spatial(R-Tree) indexes Full-text indexes
B-Tree Index Types of queries that can use B-Tree index Match the full value A match on the full key value specifies values for all columns in the index. For example, this index can help you find a person named Cuba Allen who was born on 1960-01-01. Match a leftmost prefix This index can help you find all people with the last name Allen. This uses only the first column in the index. Match a column prefix You can match on the first part of a column’s value. This index can help you find all people whose last names begin with J. This uses only the first column in the index.
Types of queries that can use a B-Tree index Match a range of values This index can help you find people whose last names are between Allen and Barrymore. This also uses only the first column. Match one part exactly and match a range on another part  This index can help you find everyone whose last name is Allen and whose first name starts with the letter K (Kim, Karl, etc.). This is an exact match on last_name and a range query on first_name. Index-only queries B-Tree indexes can normally support index-only queries, which are queries that access only the index, not the row storage.
limitations of B-Tree indexes: They are not useful if the lookup does not start from the leftmost side of the indexed columns. This index won’t help you find all people named Bill or all people born on a certain date, because those columns are not leftmost in the index.  You can’t skip columns in the index.   That is, you won’t be able to find all people whose last name is Smith and who were born on a particular date.  The storage engine can’t optimize accesses with any columns to the right of the first range condition.   If your query is WHERE last_name=&quot;Smith&quot; AND first_name LIKE 'J%' AND dob='1976-12-23', the index access will use only the first two columns in the index, because the LIKE is a range condition.
Hash indexes A hash index is built on a hash table and is useful only for exact lookups that use every column in the index.* For each row, the storage engine computes a hash code of the indexed columns, which is a small value that will probably differ from the hash codes computed for other rows with different key values. It stores the hash codes in the index and stores a pointer to each row in a hash table. CREATE TABLE testhash ( fname VARCHAR(50) NOT NULL, lname VARCHAR(50) NOT NULL, KEY USING HASH(fname) ) ENGINE=MEMORY;
Hash indexes
Hash indexes Now suppose the index uses an imaginary hash function called f( ), which returns the following values (these are just examples, not real values): f('Arjen') = 2323 f('Baron') = 7437 f('Peter') = 8784 f('Vadim') = 2458
Hash indexes mysql> SELECT lname FROM testhash WHERE fname='Peter';
Hash indexes have some limitations Because the index contains only hash codes and row pointers rather than the values themselves, MySQL can’t use the values in the index to avoid reading the rows. MySQL can’t use hash indexes for sorting because they don’t store rows in sorted order. Hash indexes don’t support partial key matching, because they compute the hash from the entire indexed value.  Hash indexes support only equality comparisons that use the =, IN( ) operators. They can’t speed up range queries, such as WHERE price > 100. Accessing data in a hash index is very quick, unless there are many collisions (multiple values with the same hash). When there are collisions, the storage engine must follow each row pointer in the linked list and compare their values to the lookup value to find the right row(s). Some index maintenance operations can be slow if there are many hash collisions.
SPATIAL INDEXES MyISAM supports spatial indexes, which you can use with geospatial types such as GEOMETRY. Unlike B-Tree indexes, spatial indexes don’t require your WHERE clauses to operate on a leftmost prefix of the index. They index the data by all dimensions at the same time. As a result, lookups can use any combination of dimensions efficiently. However, you must use the MySQL GIS functions, such as MBRCONTAINS( ),for this to work.
FULLTEXT INDEX FULLTEXT is a special type of index for MyISAM tables. It finds keywords in the text instead of comparing values directly to the values in the index. Full-text searching is completely different from other types of matching. It has many subtleties, such as stopwords, stemming and plurals, and Boolean searching. It is much more analogous to what a search engine does than to simple WHERE parameter matching. Having a full-text index on a column does not eliminate the value of a B-Tree index on the same column. Full-text indexes are for MATCH AGAINST operations, not ordinary WHERE clause operations.
Hash indexes have some limitations Because the index contains only hash codes and row pointers rather than the values themselves, MySQL can’t use the values in the index to avoid reading the rows. MySQL can’t use hash indexes for sorting because they don’t store rows in sorted order. Hash indexes don’t support partial key matching, because they compute the hash from the entire indexed value.  Hash indexes support only equality comparisons that use the =, IN( ) operators. They can’t speed up range queries, such as WHERE price > 100. Accessing data in a hash index is very quick, unless there are many collisions (multiple values with the same hash). When there are collisions, the storage engine must follow each row pointer in the linked list and compare their values to the lookup value to find the right row(s). Some index maintenance operations can be slow if there are many hash collisions.
Indexing Strategies for High Performance Isolate the Column Prefix Indexes and Index Selectivity Clustered Indexes Comparison of InnoDB and MyISAM data layout Inserting rows in primary key order with InnoDB Covering Index Using Index Scans for Sorts Packed(Prefix-compressed) Indexes Redundant and Duplicate Indexes Indexes and Locking
Indexing Strategies for High Performance Isolate the Column If you don’t isolate the indexed columns in a query, MySQL generally can’t use indexes on columns unless the columns are isolated in the query. “Isolating” the column means it should not be part of an expression or be inside a function in the query. Which is the best one? mysql> SELECT actor_id FROM sakila.actor WHERE actor_id + 1 = 5; mysql> SELECT ... WHERE TO_DAYS(CURRENT_DATE) - TO_DAYS(date_col) <= 10; mysql> SELECT ... WHERE date_col >= DATE_SUB(CURRENT_DATE, INTERVAL 10 DAY); mysql> SELECT ... WHERE date_col >= DATE_SUB('2008-01-17', INTERVAL 10 DAY);
Prefix Indexes and Index Selectivity Sometimes you need to index very long character columns, which makes your indexes large and slow. One strategy is to simulate a hash index, as we showed earlier in this chapter. But sometimes that isn’t good enough. What can you do? A prefix of the column is often selective enough to give good performance. If you’re indexing BLOB or TEXT columns, or very long VARCHAR columns, you must define prefix indexes, because MySQL disallows indexing their full length.
Cluster Indexes Clustered indexes aren’t a separate type of index. Rather, they’re an approach to data storage. The exact details vary between implementations, but InnoDB’s clustered indexes actually store a B-Tree index and the rows together in the same structure. When a table has a clustered index, its rows are actually stored in the index’s leaf pages. The term “clustered” refers to the fact that rows with adjacent key values are stored close to each other.† You can have only one clustered index per table, because you can’t store the rows in two places at once. (However, covering indexes let you emulate multiple clustered indexes; more on this later.)
Covering Index Indexes are a way to find rows efficiently, but MySQL can also use an index to retrieve a column’s data, so it doesn’t have to read the row at all. After all, the index’s leaf nodes contain the values they index; why read the row when reading the index can give you the data you want? An index that contains (or “covers”) all the data needed to satisfy a query is called a covering index.
Using Index Scans for Sorts MySQL has two ways to produce ordered results: it can use a filesort, or it can scan an index in order.* You can tell when MySQL plans to scan an index by looking for “index” in the type column in EXPLAIN. (Don’t confuse this with “Using index” in the Extra column.) Ordering the results by the index works only when the index’s order is exactly the same as the ORDER BY clause and all columns are sorted in the same direction (ascending or descending). If the query joins multiple tables, it works only when all columns in the ORDER BY clause refer to the first table. The ORDER BY clause also has the same limitation as lookup queries: it needs to form a leftmost prefix of the index. In all other cases, MySQL uses a filesort.
Packed (Prefix-Compressed) Indexes MyISAM uses prefix compression to reduce index size, allowing more of the index to fit in memory and dramatically improving performance in some cases. It packs string values by default, but you can even tell it to compress integer values. You can control how a table’s indexes are packed with the PACK_KEYS option to CREATE TABLE.
Redundant and Duplicate Indexes CREATE TABLE test ( ID INT NOT NULL PRIMARY KEY, UNIQUE(ID), INDEX(ID) );
Indexes and Locking Indexes play a very important role for InnoDB, because they let queries lock fewer rows. This is an important consideration, because in MySQL 5.0 InnoDB never unlocks a row until the transaction commits. If your queries never touch rows they don’t need, they’ll lock fewer rows, and that’s better for performance for two reasons. First, even though InnoDB’s row locks are very efficient and use very little memory, there’s still some overhead involved in row locking. Secondly, locking more rows than needed increases lock contention and reduces concurrency.
Summary of index strategies Now that you’ve learned more about indexing, perhaps you’re wondering where to get started with your own tables. The most important thing to do is examine the queries you’re going to run most often, but you should also think about less-frequent operations,such as inserting and updating data. Try to avoid the common mistake of creating indexes without knowing which queries will use them, and consider whether all your indexes together will form an optimal configuration. Sometimes you can just look at your queries, and see which indexes they need, add them, and you’re done. But sometimes you’ll have enough different kinds of queries that you can’t add perfect indexes for them all, and you’ll need to compromise. To find the best balance, you should benchmark and profile. The first thing to look at is response time. Consider adding an index for any query that’s taking too long. Then examine the queries that cause the most load , and add indexes to support them. If your system is approaching a memory, CPU, or disk bottleneck, take that into account. For example, if you do a lot of long aggregate queries to generate summaries, your disks might benefit from covering indexes that support GROUP BY queries. Where possible, try to extend existing indexes rather than adding new ones. It is usually more efficient to maintain one multicolumn index than several single-column indexes. If you don’t yet know your query distribution, strive to make your indexes as selective as you can, because highly selective indexes are usually more beneficial.
Index and Table Maintenance Finding and Repairing Table Corruption Updating Index Statistics Reducing Index and Data Fragmentation
Normalization and Denormalization Pros and Cons of a Normalized Schema Pros and Cons of a Denormalized Schema A Mixture of Normalized and Denormalized Cache and Summary Tables Counter tables
Speeding Up ALTER TABLE Modifying Only the .frm File Building MyISAM Indexes Quickly
Notes on Storage Engines The MyISAM Storage Engine Table locks No automated data recovery No transactions Only indexes are cached in memory Compact storage
The Memory Storage Engine Table locks Like MyISAM tables, Memory tables have table locks. This isn’t usually a problem though, because queries on Memory tables are normally fast. No dynamic rows Memory tables don’t support dynamic (i.e., variable-length) rows, so they don’t support BLOB and TEXT fields at all. Even a VARCHAR(5000) turns into a CHAR(5000)—a huge memory waste if most values are small. Hash indexes are the default index type Unlike for other storage engines, the default index type is hash if you don’t specifyit explicitly No index statistics Memory tables don’t support index statistics, so you may get bad execution plans for some complex queries. Content is lost on restart Memory tables don’t persist any data to disk, so the data is lost when the server restarts, even though the tables’ definitions remain.
Thanks!

More Related Content

PDF
MariaDB 마이그레이션 - 네오클로바
NeoClova
 
PDF
MariaDB 제품 소개
NeoClova
 
PDF
Differences between MariaDB 10.3 & MySQL 8.0
Colin Charles
 
PPTX
Basics of MongoDB
HabileLabs
 
DOCX
MySQL_SQL_Tunning_v0.1.3.docx
NeoClova
 
PDF
The NFS Version 4 Protocol
Kelum Senanayake
 
DOCX
Keepalived+MaxScale+MariaDB_운영매뉴얼_1.0.docx
NeoClova
 
PDF
ProxySQL High Availability (Clustering)
Mydbops
 
MariaDB 마이그레이션 - 네오클로바
NeoClova
 
MariaDB 제품 소개
NeoClova
 
Differences between MariaDB 10.3 & MySQL 8.0
Colin Charles
 
Basics of MongoDB
HabileLabs
 
MySQL_SQL_Tunning_v0.1.3.docx
NeoClova
 
The NFS Version 4 Protocol
Kelum Senanayake
 
Keepalived+MaxScale+MariaDB_운영매뉴얼_1.0.docx
NeoClova
 
ProxySQL High Availability (Clustering)
Mydbops
 

What's hot (20)

PDF
Maxscale_메뉴얼
NeoClova
 
PDF
Azure Redis Cache
Chourouk HJAIEJ
 
PDF
[2018] MySQL 이중화 진화기
NHN FORWARD
 
PPTX
Postgresql
NexThoughts Technologies
 
PDF
MariaDB 10.11 key features overview for DBAs
Federico Razzoli
 
PPTX
Mysql replication
ThreeSnakes
 
PDF
Oracle Latch and Mutex Contention Troubleshooting
Tanel Poder
 
PDF
How to Design Indexes, Really
Karwin Software Solutions LLC
 
PDF
Mastering PostgreSQL Administration
EDB
 
PDF
Aurora MySQL Backtrack을 이용한 빠른 복구 방법 - 진교선 :: AWS Database Modernization Day 온라인
Amazon Web Services Korea
 
PDF
MariaDB MaxScale monitor 매뉴얼
NeoClova
 
PPTX
PostgreSQL Security. How Do We Think?
Ohyama Masanori
 
PDF
Using all of the high availability options in MariaDB
MariaDB plc
 
PPTX
Running MariaDB in multiple data centers
MariaDB plc
 
PDF
Federated Engine 실무적용사례
I Goo Lee
 
PPTX
Introduction to Kafka and Zookeeper
Rahul Jain
 
PDF
MariaDB Galera Cluster presentation
Francisco Gonçalves
 
PPTX
Maria db 이중화구성_고민하기
NeoClova
 
PPTX
JSON and the Oracle Database
Maria Colgan
 
PPTX
MariaDB Galera Cluster
Abdul Manaf
 
Maxscale_메뉴얼
NeoClova
 
Azure Redis Cache
Chourouk HJAIEJ
 
[2018] MySQL 이중화 진화기
NHN FORWARD
 
MariaDB 10.11 key features overview for DBAs
Federico Razzoli
 
Mysql replication
ThreeSnakes
 
Oracle Latch and Mutex Contention Troubleshooting
Tanel Poder
 
How to Design Indexes, Really
Karwin Software Solutions LLC
 
Mastering PostgreSQL Administration
EDB
 
Aurora MySQL Backtrack을 이용한 빠른 복구 방법 - 진교선 :: AWS Database Modernization Day 온라인
Amazon Web Services Korea
 
MariaDB MaxScale monitor 매뉴얼
NeoClova
 
PostgreSQL Security. How Do We Think?
Ohyama Masanori
 
Using all of the high availability options in MariaDB
MariaDB plc
 
Running MariaDB in multiple data centers
MariaDB plc
 
Federated Engine 실무적용사례
I Goo Lee
 
Introduction to Kafka and Zookeeper
Rahul Jain
 
MariaDB Galera Cluster presentation
Francisco Gonçalves
 
Maria db 이중화구성_고민하기
NeoClova
 
JSON and the Oracle Database
Maria Colgan
 
MariaDB Galera Cluster
Abdul Manaf
 
Ad

Viewers also liked (11)

PDF
Architecture and Design MySQL powered applications by Peter Zaitsev Meetup Sa...
MySQL Brasil
 
PPT
Slides
webhostingguy
 
ODP
Mastering InnoDB Diagnostics
guest8212a5
 
PDF
Optimizing MySQL
Morgan Tocker
 
PDF
MySQL Performance Metrics that Matter
Morgan Tocker
 
PDF
MySQL 5.6 - Operations and Diagnostics Improvements
Morgan Tocker
 
PDF
MySQL For Linux Sysadmins
Morgan Tocker
 
PDF
The InnoDB Storage Engine for MySQL
Morgan Tocker
 
PDF
Introduction to MySQL
Giuseppe Maxia
 
PPT
MySQL Atchitecture and Concepts
Tuyen Vuong
 
PDF
MySQL Server Defaults
Morgan Tocker
 
Architecture and Design MySQL powered applications by Peter Zaitsev Meetup Sa...
MySQL Brasil
 
Mastering InnoDB Diagnostics
guest8212a5
 
Optimizing MySQL
Morgan Tocker
 
MySQL Performance Metrics that Matter
Morgan Tocker
 
MySQL 5.6 - Operations and Diagnostics Improvements
Morgan Tocker
 
MySQL For Linux Sysadmins
Morgan Tocker
 
The InnoDB Storage Engine for MySQL
Morgan Tocker
 
Introduction to MySQL
Giuseppe Maxia
 
MySQL Atchitecture and Concepts
Tuyen Vuong
 
MySQL Server Defaults
Morgan Tocker
 
Ad

Similar to High Performance Mysql (20)

PDF
MySQL 5.6 Performance
MYXPLAIN
 
PPT
MySQL Performance Tuning at COSCUP 2014
Ryusuke Kajiyama
 
PDF
01 demystifying mysq-lfororacledbaanddeveloperv1
Ivan Ma
 
PDF
제3회난공불락 오픈소스 인프라세미나 - MySQL Performance
Tommy Lee
 
PPTX
MySQL Tech Tour 2015 - Manage & Tune
Mark Swarbrick
 
PDF
MySQL Enterprise Monitor
Mario Beck
 
PDF
Highload Perf Tuning
HighLoad2009
 
PPS
MySQL Optimization from a Developer's point of view
Sachin Khosla
 
PDF
High Performance MySQL Optimization Backups Replication and More Second Editi...
samsmadomira
 
PPTX
My sql performance tuning course
Alberto Centanni
 
PDF
MySQL Enterprise Monitor
Mark Swarbrick
 
PPTX
7 Database Mistakes YOU Are Making -- Linuxfest Northwest 2019
Dave Stokes
 
PDF
MySQL Webinar Series 4/4 - Manage & tune
Mark Swarbrick
 
PDF
Full Download High Performance MySQL Optimization Backups Replication and Mor...
qunajfogoe
 
PDF
High Performance MySQL Optimization Backups Replication and More Second Editi...
zcnpkpgkum286
 
PDF
MySQL 和 InnoDB 性能
YUCHENG HU
 
PDF
OSDC 2010 | MySQL and InnoDB Performance - what we know, what we don't by Ba...
NETWAYS
 
PDF
U C2007 My S Q L Performance Cookbook
guestae36d0
 
PPTX
6 Tips to MySQL Performance Tuning
OracleMySQL
 
PPTX
MySQL performance tuning
Anurag Srivastava
 
MySQL 5.6 Performance
MYXPLAIN
 
MySQL Performance Tuning at COSCUP 2014
Ryusuke Kajiyama
 
01 demystifying mysq-lfororacledbaanddeveloperv1
Ivan Ma
 
제3회난공불락 오픈소스 인프라세미나 - MySQL Performance
Tommy Lee
 
MySQL Tech Tour 2015 - Manage & Tune
Mark Swarbrick
 
MySQL Enterprise Monitor
Mario Beck
 
Highload Perf Tuning
HighLoad2009
 
MySQL Optimization from a Developer's point of view
Sachin Khosla
 
High Performance MySQL Optimization Backups Replication and More Second Editi...
samsmadomira
 
My sql performance tuning course
Alberto Centanni
 
MySQL Enterprise Monitor
Mark Swarbrick
 
7 Database Mistakes YOU Are Making -- Linuxfest Northwest 2019
Dave Stokes
 
MySQL Webinar Series 4/4 - Manage & tune
Mark Swarbrick
 
Full Download High Performance MySQL Optimization Backups Replication and Mor...
qunajfogoe
 
High Performance MySQL Optimization Backups Replication and More Second Editi...
zcnpkpgkum286
 
MySQL 和 InnoDB 性能
YUCHENG HU
 
OSDC 2010 | MySQL and InnoDB Performance - what we know, what we don't by Ba...
NETWAYS
 
U C2007 My S Q L Performance Cookbook
guestae36d0
 
6 Tips to MySQL Performance Tuning
OracleMySQL
 
MySQL performance tuning
Anurag Srivastava
 

More from liufabin 66688 (20)

PDF
人大2010新生住宿指南
liufabin 66688
 
PDF
中山大学2010新生住宿指南
liufabin 66688
 
PDF
武汉大学2010新生住宿指南
liufabin 66688
 
PDF
天津大学2010新生住宿指南
liufabin 66688
 
PDF
清华2010新生住宿指南
liufabin 66688
 
PDF
南京大学2010新生住宿指南
liufabin 66688
 
PDF
复旦2010新生住宿指南
liufabin 66688
 
PDF
北京师范大学2010新生住宿指南
liufabin 66688
 
PDF
北大2010新生住宿指南
liufabin 66688
 
PPT
Optimzing mysql
liufabin 66688
 
PDF
Mysql Replication Excerpt 5.1 En
liufabin 66688
 
PDF
Drupal Con My Sql Ha 2008 08 29
liufabin 66688
 
PDF
Application Partitioning Wp
liufabin 66688
 
PDF
090507.New Replication Features(2)
liufabin 66688
 
DOC
Mysql Replication
liufabin 66688
 
PDF
high performance mysql
liufabin 66688
 
PDF
Lecture28
liufabin 66688
 
PDF
Refactoring And Unit Testing
liufabin 66688
 
PDF
Refactoring Simple Example
liufabin 66688
 
PDF
Refactoring Example
liufabin 66688
 
人大2010新生住宿指南
liufabin 66688
 
中山大学2010新生住宿指南
liufabin 66688
 
武汉大学2010新生住宿指南
liufabin 66688
 
天津大学2010新生住宿指南
liufabin 66688
 
清华2010新生住宿指南
liufabin 66688
 
南京大学2010新生住宿指南
liufabin 66688
 
复旦2010新生住宿指南
liufabin 66688
 
北京师范大学2010新生住宿指南
liufabin 66688
 
北大2010新生住宿指南
liufabin 66688
 
Optimzing mysql
liufabin 66688
 
Mysql Replication Excerpt 5.1 En
liufabin 66688
 
Drupal Con My Sql Ha 2008 08 29
liufabin 66688
 
Application Partitioning Wp
liufabin 66688
 
090507.New Replication Features(2)
liufabin 66688
 
Mysql Replication
liufabin 66688
 
high performance mysql
liufabin 66688
 
Lecture28
liufabin 66688
 
Refactoring And Unit Testing
liufabin 66688
 
Refactoring Simple Example
liufabin 66688
 
Refactoring Example
liufabin 66688
 

High Performance Mysql

  • 1. High Performance MySQL 高性能 MySQL 苏业栋
  • 2. High Performance MySQL Finding Bottlenecks: Benchmarking and Profiling Schema Optimization and Indexing MySQL Architecture
  • 4. Connection Management and Security 1. One thread per connection 2. Server Caches threads 3. Authentication based on user, pass, SSL 4. Verify on every query
  • 5. Optimization and Execution 1.Rewrite query 2.Determine which table to read first 3.Choose which index to use 4.Statistic the storage engines before make optimization
  • 6. Concurrency Control Read/Write Locks Lock Granularity Table locks MyISAM Row locks INNODB
  • 7. Transaction Atomicity Consistency Isolation Durability
  • 8. Isolation Levels READ UNCOMMITTED READ COMMITTED REPEATABLE READ SERIALIZABLE
  • 10. Deadlocks Deadlock is when two or more transactions are mutually holding and requesting locks on the same resources, creating a cycle of dependencies . Deadlocks occur when transactions try to lock resources in a different order.
  • 11. Transaction Logging Transaction logging helps make transactions more efficient. Updating the log first and at some later time, update disk. Sequence I/O in a small area .vs. Random I/O in many places If there’s a crash after the update is written to the transaction log but before the changes are made to the data itself, the storage engine can still recover the changes upon restart.
  • 12. Transactions in MySQL AUTOCOMMIT If DDL execute, then commit But No effect on MyISAM or Memory tables
  • 13. Mixing storage engines in transactions Store engines manage the transactions not server. If mix transactional and nontransactional tables, rollback will be partially
  • 14. Implicit and explicit locking Implicit locking Acquire lock during transaction and not release until COMMIT or ROLLBACK Explicit locking SELECT … LOCK IN SHARE MODE SELECT … FOR UPDATE
  • 15. MySQL’s Storage Engines MyISAM InnoDB Memory Archive CSV Federated Blackhole NDB Falcon solidDB PBXT(Primebase XT) Maria Storage
  • 16. MyISAM Engine Locking and concurrency Automatic repair Manual repair Index feature Delayed key writes
  • 17. InnoDB Engine Transaction Tablespace MVCC All 4 isolation level Clustered index
  • 18. Selecting the Right Engine Important
  • 19. Consideration Transactions InnoDB: most stable,well-integrated,proven choice when write MyISAM: no transactions, primarily either SELECT or INSERT Concurrency Just insert and read cuncurrently, MyISAM mixture operations without interfering with each other,row-level locking Backups shut down for backups or online backups
  • 20. Consideration Crash recovery If you have a lot of data, you should seriously consider how long it will take to recover from a crash. MyISAM tables generally become corrupt more easily and take much longer to recover than InnoDB tables, for example. In fact, this is one of the most important reasons why a lot of people use InnoDB when they don’t need transactions. Special feature Full text index? MYISAM Clustered index optimization? InnoDB or solidDB
  • 21. Practical Examples logging Read-only or read-mostly tables MyISAM is faster than InnoDB but not always Order processing transactions required foreign key? InnoDB Stock quotes MyISAM
  • 23. Table Conversions ALTER TABLE Dump and import Create and select
  • 25. Finding Bottlenecks: Benchmarking and Profiling At some point, you’re bound to need more performance from MySQL. But what should you try to improve? A particular query? Your schema? Your hardware? The only way to know is to measure what your system is doing, and test its performance under various conditions.
  • 26. We want to find What prevents better performance ? What will prevent better performance in the future.
  • 27. Benchmarking and Profiling Benchmarking answers the question “How well does this perform?” Profiling answers the question “Why does it perform the way it does?”
  • 28. Benchmark can help you Measure how your application currently performs. Validate your system’s scalability. Plan for growth. Test your application’s ability to tolerate a changing environment. Test different hardware, software, and operating system configurations.
  • 29. Benchmarking Strategies The application as a whole Isolate MySQL ?
  • 30. Application as a whole instead of just MySQL You don’t care about MySQL’s performance in particular; you care about the whole application. MySQL is not always the application bottleneck Only by testing the full application can you see how each part’s cache behaves. Benchmarks are good only to the extent that they reflect your actual application’s behavior, which is hard to do when you’re testing only part of it.
  • 31. Just need a MySQL benchmark You want to compare different schemas or queries. You want to benchmark a specific problem you see in the application. You want to avoid a long benchmark in favor of a shorter one that gives you a faster “cycle time” for making and measuring changes.
  • 32. What to Measure You need to identify your goals before you start benchmarking—indeed, before you even design your benchmarks. Your goals will determine the tools and techniques you’ll use to get accurate, meaningful results.
  • 33. Consideration Transactions per time unit Response time or latency Scalability Concurrency
  • 34. Bad bemchmarks(1) Using a subset of the real data size Using incorrectly distributed data Using unrealistically distributed parameters, such as pretending that all user profiles are equally likely to be viewed. Using a single-user scenario for a multiuser application Benchmarking a distributed application on a single server
  • 35. Bad bemchmarks(2) Failing to match real user behavior Running identical queries in a loop Failing to check for errors. Ignoring how the system performs when it’s not warmed up, such as right after a restart Using default server settings
  • 36. Design and Planning a Benchmark Identify the problem and the goal Whether to use a standard benchmark or design your own Queries against the data
  • 37. Benchmarking Tools Full-Stack Tools: such as ab, http_load, JMeter Single-Component Tools : such as mysqlslap, sysbench, Database Test Suite, sql-bench
  • 38. Profiling Profiling shows you how much each part of a system contributes to the total cost of producing a result The simplest cost metric is time, but profiling can also measure the number of function calls, I/O operations, database queries. The goal is to understand why a system performs the way it does.
  • 39. Take PHP web site for example Database access is often, but not always, the bottleneck in applications. May be… External resources, such as calls to web services or search engines Operations that require processing large amounts of data in the application, such as parsing big XML files Expensive operations in tight loops, such as abusing regular expressions Badly optimized algorithms, such as naive search algorithms to find items in lists
  • 40. How and what to measure We recommend that you include profiling code in every new project you start. It might be hard to inject profiling code into an existing application, but it’s easy to include it in new applications. What should we do in the profiling code?
  • 41. Your profiling code should gather and log at least the following Total execution time, or “wall-clock time” (in web applications, this is the total page render time) Each query executed, and its execution time Each connection opened to the MySQL server Every call to an external resource, such as web services, memcached , and externally invoked scripts Potentially expensive function calls, such as XML parsing User and system CPU time If you act like the mentioned, you may lucky to find something you might not capture otherwise, like: Overall performance problems Sporadically increased response times System bottlenecks, which might not be MySQL Execution time of “invisible” users, such as search engine spiders
  • 42. Will Profiling Slow Your Servers? <?php $profiling_enabled = rand(0, 100) > 99; ?> Profiling just 1% of your page views should help you find the worst problems.
  • 43. MySQL Profiling The goal is to find out where MySQL spends most of its time. The kinds of information you can glean include: Which data MySQL accesses most What kinds of queries MySQL executes most What states MySQL threads spend the most time in What subsystems MySQL uses most to execute a query What kinds of data accesses MySQL does during a query How much of various kinds of activities, such as index scans, MySQL does
  • 44. Logging queries The general log captures all queries, as well as some nonquery events such as connecting and disconnecting. You can enable it with a single configuration directive: log = <file_name> By design, the general log does not contain execution times or any other information that’s available only after a query finishes. Slow log capture all queries that take more than two seconds to execute, and log queries that don’t use any indexes. It will also log slow administrative statements, such as OPTIMIZE TABLE: log-slow-queries = <file_name> long_query_time = 2 log-queries-not-using-indexes log-slow-admin-statements
  • 45. How to read the slow query log 1 # Time: 071031 20:03:16 2 # User@Host: root[root] @ localhost [] 3 # Thread_id: 4 4 # Query_time: 0.503016 Lock_time: 0.000048 Rows_sent: 56 Rows_examined: 1113 5 # QC_Hit: No Full_scan: No Full_join: No Tmp_table: Yes Disk_tmp_table: No 6 # Filesort: Yes Disk_filesort: No Merge_passes: 0 7 # InnoDB_IO_r_ops: 19 InnoDB_IO_r_bytes: 311296 InnoDB_IO_r_wait: 0.382176 8 # InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.067538 9 # InnoDB_pages_distinct: 20 10 SELECT ...
  • 46. Try the slow query again? You may find a slow query, run it yourself, and find that it executes in a fraction of a second. Appearing in the log simply means the query took a long time then; it doesn’t mean it will take a long time now or in the future. There are many reasons why a query can be slow sometimes and fast at other times: A table may have been locked, causing the query to wait. The lock time indicates how long the query waited for locks to be released. The data or indexes may not have been cached in memory yet. This is common when MySQL is first started or hasn’t been well tuned. A nightly backup process may have been running, making all disk I/O slower. The server may have been running other queries at the same time, slowing down this query.
  • 47. Log analysis tool mysqldumpslow mysql_slow_log_filter mysql_slow_log_parser mysqlsla
  • 48. Profiling a MySQL Server SHOW STATUS ; mysqladmin extended -r -i 10
  • 49. Profiling a MySQL Server Bytes_received and Bytes_sent The traffic to and from the server Com_* The commands the server is executing Created_* Temporary tables and files created during query execution Handler_* Storage engine operations Select_* Various types of join execution plans Sort_* Several types of sort information
  • 50. SHOW PROFILE mysql> SET profiling = 1; Now let’s run a query: mysql> SELECT COUNT(DISTINCT actor.first_name) AS cnt_name, COUNT(*) AS cnt -> FROM sakila.film_actor -> INNER JOIN sakila.actor USING(actor_id) -> GROUP BY sakila.film_actor.film_id -> ORDER BY cnt_name DESC; ... 997 rows in set (0.03 sec)
  • 51. SHOW PROFILE mysql> SHOW PROFILE; +------------------------+-----------+ | Status | Duration | +------------------------+-----------+ | (initialization) | 0.000005 | | Opening tables | 0.000033 | | System lock | 0.000037 | | Table lock | 0.000024 | | init | 0.000079 | | optimizing | 0.000024 | | statistics | 0.000079 | | preparing | 0.00003 | | Creating tmp table | 0.000124 | | executing | 0.000008 | | Copying to tmp table | 0.010048 | | Creating sort index | 0.004769 | | Copying to group table | 0.0084880 | | Sorting result | 0.001136 | | Sending data | 0.000925 | | end | 0.00001 | | removing tmp table | 0.00004 | | end | 0.000005 | | removing tmp table | 0.00001 | | end | 0.000011 | | query end | 0.00001 | | freeing items | 0.000025 | | removing tmp table | 0.00001 | | freeing items | 0.000016 | | closing tables | 0.000017 | | logging slow query | 0.000006 | +------------------------+-----------+
  • 52. Schema Optimization and Indexing Choosing Optimal Data Types Indexing Basics Indexing Strategies for High Performance Index and Table Maintenance Normalization and Denormalization Speeding Up ALTER TABLE Notes on Storage Engines
  • 53. Choosing Optimal Data Type Some guidelines Smaller is usually better Simple is good Avoid NULL if possible It’s harder for MySQL to optimize queries that refer to nullable columns,because they make indexes, index statistics, and value comparisons more complicated.
  • 54. Whole Numbers MySQL lets you specify a “width” for integer types, such as INT(11). This is meaningless for most applications: it does not restrict the legal range of values, but simply specifies the number of characters MySQL’s interactive tools (such as the command-line client) will reserve for display purposes. For storage and computational purposes, INT(1) is identical to INT(20).
  • 55. Real Numbers MySQL uses DOUBLE for its internal calculations on floating point types. Because of the additional space requirements and computational cost, you should use DECIMAL only when you need exact results for fractional numbers—for example, when storing financial data.
  • 56. String Types VARCHAR and CHAR types BLOB and TEXT types Using ENUM instead of a string type
  • 58. Date and Time Types DATETIME stores the date and time packed into an integer in YYYYMMDDHHMMSS format, independent of time zone, eight bytes of storage space TIMESTAMP type stores the number of seconds elapsed since midnight,depends on the time zone,only four bytes of storage
  • 59. Bit-Packed Data types MySQL has a few storage types that use individual bits within a value to store data compactly. All of these types are technically string types, regardless of the underlying storage format and manipulations.
  • 60. Choosing Identifiers When choosing a type for an identifier column, you need to consider not only the storage type, but also how MySQL performs computations and comparisons on that type. Once you choose a type, make sure you use the same type in all related tables. The types should match exactly, including properties such as UNSIGNED.* Mixing different data types can cause performance problems, and even if it doesn’t, implicit type conversions during comparisons can create hard-to-find errors. These may even crop up much later, after you’ve forgotten that you’re comparing different data types.
  • 61. Special Types of Data IP address is really an unsigned 32-bit integer, not a string. MySQL provides the INET_ATON( ) and INET_NTOA( ) functions to convert between the two representations.
  • 62. Index Basics Indexes are implemented in the storage engine layer, not the server layer. Thus, they are not standardized: indexing works slightly differently in each engine, and not all engines support all types of indexes.
  • 63. Index Basics Types of Indexes B-Tree indexes Hash indexes Spatial(R-Tree) indexes Full-text indexes
  • 65. B-Tree Index CREATE TABLE People ( last_name varchar(50) not null, first_name varchar(50) not null, dob date not null, gender enum('m', 'f') not null, key(last_name, first_name, dob) );
  • 66. B-Tree Index Types of Indexes B-Tree indexes Hash indexes Spatial(R-Tree) indexes Full-text indexes
  • 67. Types of Indexes B-Tree indexes Hash indexes Spatial(R-Tree) indexes Full-text indexes
  • 68. B-Tree Index Types of queries that can use B-Tree index Match the full value A match on the full key value specifies values for all columns in the index. For example, this index can help you find a person named Cuba Allen who was born on 1960-01-01. Match a leftmost prefix This index can help you find all people with the last name Allen. This uses only the first column in the index. Match a column prefix You can match on the first part of a column’s value. This index can help you find all people whose last names begin with J. This uses only the first column in the index.
  • 69. Types of queries that can use a B-Tree index Match a range of values This index can help you find people whose last names are between Allen and Barrymore. This also uses only the first column. Match one part exactly and match a range on another part This index can help you find everyone whose last name is Allen and whose first name starts with the letter K (Kim, Karl, etc.). This is an exact match on last_name and a range query on first_name. Index-only queries B-Tree indexes can normally support index-only queries, which are queries that access only the index, not the row storage.
  • 70. limitations of B-Tree indexes: They are not useful if the lookup does not start from the leftmost side of the indexed columns. This index won’t help you find all people named Bill or all people born on a certain date, because those columns are not leftmost in the index. You can’t skip columns in the index. That is, you won’t be able to find all people whose last name is Smith and who were born on a particular date. The storage engine can’t optimize accesses with any columns to the right of the first range condition. If your query is WHERE last_name=&quot;Smith&quot; AND first_name LIKE 'J%' AND dob='1976-12-23', the index access will use only the first two columns in the index, because the LIKE is a range condition.
  • 71. Hash indexes A hash index is built on a hash table and is useful only for exact lookups that use every column in the index.* For each row, the storage engine computes a hash code of the indexed columns, which is a small value that will probably differ from the hash codes computed for other rows with different key values. It stores the hash codes in the index and stores a pointer to each row in a hash table. CREATE TABLE testhash ( fname VARCHAR(50) NOT NULL, lname VARCHAR(50) NOT NULL, KEY USING HASH(fname) ) ENGINE=MEMORY;
  • 73. Hash indexes Now suppose the index uses an imaginary hash function called f( ), which returns the following values (these are just examples, not real values): f('Arjen') = 2323 f('Baron') = 7437 f('Peter') = 8784 f('Vadim') = 2458
  • 74. Hash indexes mysql> SELECT lname FROM testhash WHERE fname='Peter';
  • 75. Hash indexes have some limitations Because the index contains only hash codes and row pointers rather than the values themselves, MySQL can’t use the values in the index to avoid reading the rows. MySQL can’t use hash indexes for sorting because they don’t store rows in sorted order. Hash indexes don’t support partial key matching, because they compute the hash from the entire indexed value. Hash indexes support only equality comparisons that use the =, IN( ) operators. They can’t speed up range queries, such as WHERE price > 100. Accessing data in a hash index is very quick, unless there are many collisions (multiple values with the same hash). When there are collisions, the storage engine must follow each row pointer in the linked list and compare their values to the lookup value to find the right row(s). Some index maintenance operations can be slow if there are many hash collisions.
  • 76. SPATIAL INDEXES MyISAM supports spatial indexes, which you can use with geospatial types such as GEOMETRY. Unlike B-Tree indexes, spatial indexes don’t require your WHERE clauses to operate on a leftmost prefix of the index. They index the data by all dimensions at the same time. As a result, lookups can use any combination of dimensions efficiently. However, you must use the MySQL GIS functions, such as MBRCONTAINS( ),for this to work.
  • 77. FULLTEXT INDEX FULLTEXT is a special type of index for MyISAM tables. It finds keywords in the text instead of comparing values directly to the values in the index. Full-text searching is completely different from other types of matching. It has many subtleties, such as stopwords, stemming and plurals, and Boolean searching. It is much more analogous to what a search engine does than to simple WHERE parameter matching. Having a full-text index on a column does not eliminate the value of a B-Tree index on the same column. Full-text indexes are for MATCH AGAINST operations, not ordinary WHERE clause operations.
  • 78. Hash indexes have some limitations Because the index contains only hash codes and row pointers rather than the values themselves, MySQL can’t use the values in the index to avoid reading the rows. MySQL can’t use hash indexes for sorting because they don’t store rows in sorted order. Hash indexes don’t support partial key matching, because they compute the hash from the entire indexed value. Hash indexes support only equality comparisons that use the =, IN( ) operators. They can’t speed up range queries, such as WHERE price > 100. Accessing data in a hash index is very quick, unless there are many collisions (multiple values with the same hash). When there are collisions, the storage engine must follow each row pointer in the linked list and compare their values to the lookup value to find the right row(s). Some index maintenance operations can be slow if there are many hash collisions.
  • 79. Indexing Strategies for High Performance Isolate the Column Prefix Indexes and Index Selectivity Clustered Indexes Comparison of InnoDB and MyISAM data layout Inserting rows in primary key order with InnoDB Covering Index Using Index Scans for Sorts Packed(Prefix-compressed) Indexes Redundant and Duplicate Indexes Indexes and Locking
  • 80. Indexing Strategies for High Performance Isolate the Column If you don’t isolate the indexed columns in a query, MySQL generally can’t use indexes on columns unless the columns are isolated in the query. “Isolating” the column means it should not be part of an expression or be inside a function in the query. Which is the best one? mysql> SELECT actor_id FROM sakila.actor WHERE actor_id + 1 = 5; mysql> SELECT ... WHERE TO_DAYS(CURRENT_DATE) - TO_DAYS(date_col) <= 10; mysql> SELECT ... WHERE date_col >= DATE_SUB(CURRENT_DATE, INTERVAL 10 DAY); mysql> SELECT ... WHERE date_col >= DATE_SUB('2008-01-17', INTERVAL 10 DAY);
  • 81. Prefix Indexes and Index Selectivity Sometimes you need to index very long character columns, which makes your indexes large and slow. One strategy is to simulate a hash index, as we showed earlier in this chapter. But sometimes that isn’t good enough. What can you do? A prefix of the column is often selective enough to give good performance. If you’re indexing BLOB or TEXT columns, or very long VARCHAR columns, you must define prefix indexes, because MySQL disallows indexing their full length.
  • 82. Cluster Indexes Clustered indexes aren’t a separate type of index. Rather, they’re an approach to data storage. The exact details vary between implementations, but InnoDB’s clustered indexes actually store a B-Tree index and the rows together in the same structure. When a table has a clustered index, its rows are actually stored in the index’s leaf pages. The term “clustered” refers to the fact that rows with adjacent key values are stored close to each other.† You can have only one clustered index per table, because you can’t store the rows in two places at once. (However, covering indexes let you emulate multiple clustered indexes; more on this later.)
  • 83. Covering Index Indexes are a way to find rows efficiently, but MySQL can also use an index to retrieve a column’s data, so it doesn’t have to read the row at all. After all, the index’s leaf nodes contain the values they index; why read the row when reading the index can give you the data you want? An index that contains (or “covers”) all the data needed to satisfy a query is called a covering index.
  • 84. Using Index Scans for Sorts MySQL has two ways to produce ordered results: it can use a filesort, or it can scan an index in order.* You can tell when MySQL plans to scan an index by looking for “index” in the type column in EXPLAIN. (Don’t confuse this with “Using index” in the Extra column.) Ordering the results by the index works only when the index’s order is exactly the same as the ORDER BY clause and all columns are sorted in the same direction (ascending or descending). If the query joins multiple tables, it works only when all columns in the ORDER BY clause refer to the first table. The ORDER BY clause also has the same limitation as lookup queries: it needs to form a leftmost prefix of the index. In all other cases, MySQL uses a filesort.
  • 85. Packed (Prefix-Compressed) Indexes MyISAM uses prefix compression to reduce index size, allowing more of the index to fit in memory and dramatically improving performance in some cases. It packs string values by default, but you can even tell it to compress integer values. You can control how a table’s indexes are packed with the PACK_KEYS option to CREATE TABLE.
  • 86. Redundant and Duplicate Indexes CREATE TABLE test ( ID INT NOT NULL PRIMARY KEY, UNIQUE(ID), INDEX(ID) );
  • 87. Indexes and Locking Indexes play a very important role for InnoDB, because they let queries lock fewer rows. This is an important consideration, because in MySQL 5.0 InnoDB never unlocks a row until the transaction commits. If your queries never touch rows they don’t need, they’ll lock fewer rows, and that’s better for performance for two reasons. First, even though InnoDB’s row locks are very efficient and use very little memory, there’s still some overhead involved in row locking. Secondly, locking more rows than needed increases lock contention and reduces concurrency.
  • 88. Summary of index strategies Now that you’ve learned more about indexing, perhaps you’re wondering where to get started with your own tables. The most important thing to do is examine the queries you’re going to run most often, but you should also think about less-frequent operations,such as inserting and updating data. Try to avoid the common mistake of creating indexes without knowing which queries will use them, and consider whether all your indexes together will form an optimal configuration. Sometimes you can just look at your queries, and see which indexes they need, add them, and you’re done. But sometimes you’ll have enough different kinds of queries that you can’t add perfect indexes for them all, and you’ll need to compromise. To find the best balance, you should benchmark and profile. The first thing to look at is response time. Consider adding an index for any query that’s taking too long. Then examine the queries that cause the most load , and add indexes to support them. If your system is approaching a memory, CPU, or disk bottleneck, take that into account. For example, if you do a lot of long aggregate queries to generate summaries, your disks might benefit from covering indexes that support GROUP BY queries. Where possible, try to extend existing indexes rather than adding new ones. It is usually more efficient to maintain one multicolumn index than several single-column indexes. If you don’t yet know your query distribution, strive to make your indexes as selective as you can, because highly selective indexes are usually more beneficial.
  • 89. Index and Table Maintenance Finding and Repairing Table Corruption Updating Index Statistics Reducing Index and Data Fragmentation
  • 90. Normalization and Denormalization Pros and Cons of a Normalized Schema Pros and Cons of a Denormalized Schema A Mixture of Normalized and Denormalized Cache and Summary Tables Counter tables
  • 91. Speeding Up ALTER TABLE Modifying Only the .frm File Building MyISAM Indexes Quickly
  • 92. Notes on Storage Engines The MyISAM Storage Engine Table locks No automated data recovery No transactions Only indexes are cached in memory Compact storage
  • 93. The Memory Storage Engine Table locks Like MyISAM tables, Memory tables have table locks. This isn’t usually a problem though, because queries on Memory tables are normally fast. No dynamic rows Memory tables don’t support dynamic (i.e., variable-length) rows, so they don’t support BLOB and TEXT fields at all. Even a VARCHAR(5000) turns into a CHAR(5000)—a huge memory waste if most values are small. Hash indexes are the default index type Unlike for other storage engines, the default index type is hash if you don’t specifyit explicitly No index statistics Memory tables don’t support index statistics, so you may get bad execution plans for some complex queries. Content is lost on restart Memory tables don’t persist any data to disk, so the data is lost when the server restarts, even though the tables’ definitions remain.

Editor's Notes

  • #3: The 2 nd layer also called the brain of Mysql. Deal with parsing ,analysis,optimization,caching, and built-in the functions
  • #4: The 2 nd layer also called the brain of Mysql. Deal with parsing ,analysis,optimization,caching, and built-in the functions
  • #5: Security socket layer
  • #7: Granularity : 粒度
  • #9: Dirty read 只能够读到已经提交的数据,但是不能保证另外一个事务中,相同两次读取结果一致 保证结果一致,但是你选了几行之后,如果别人在这几行中间插入数据,那么你选的就变了 SERIALIZABLE 保证了你选的那个序列, SERIALIZABLE places a lock on every row it reads. 非常少用
  • #13: DDL : Data Definition Language
  • #18: MVCC to achieve high concurrency, InnoDB can store each table’s data and indexes in separate files. InnoDB can also use raw disk partitions for building its tablespace.
  • #36: such as “think time” on a web page.
  • #42: Sporadically 偶尔的 , 零星的
  • #46: Query Cache
  • #77: Spatial, 立体的