一个相同查询在不同 RDS for MySQL 实例上性能差异的案例分析-CSDN博客

本文分析了一个相同的查询在 RDS for MySQL 不同实例上表现出显著性能差异的案例。通过对执行计划的深入对比，发现了关联字段类型不匹配的问题，并给出了具体的解决方案。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

一个相同查询在不同 RDS for MySQL 实例上性能差异的案例分析

相同查询在数据量相近的情况下在不同 RDS for MySQL 实例上有不同的性能表现，容易引发用户对 RDS for MySQL 实例的性能差异性的疑虑，本文分享下近期碰到的一个原因比较隐蔽但很常见的案例。

1. 问题出现

一个用户的下述查询在 RDS for MySQL 实例 A 上执行需要 30+ 毫秒，而在 RDS for MySQL 实例 B 执行需要 12+ 秒。

SELECT A.PayId, A.sourceType, 
A.txnTime, A.txnEndTime, A.invid, A.payStatus, 
A.invstatus makeinvoicestatus, A.createTime, B.invstatus invstatus, F.returncode returncode
FROM Pay A
LEFT JOIN 
(
  SELECT  M.invstatus invstatus,  M.PayId PayId,   M.invid invid
  FROM  inv_msg M
  WHERE M.sourcetype != '03'
) B ON A.PayId = B.PayId
LEFT JOIN 
(
  SELECT C.invid invoiceids,  C.returncode, C.creatime
  FROM inv_detail C,
    (
      SELECT D.invid invoiceids, max(D.creatime) creatime
      FROM inv_detail D
      GROUP BY  D.invid
    ) E
  WHERE C.invid = E.invoiceids
  AND C.creatime = E.creatime
) F ON B.invid = F.invoiceids
WHERE A.deleteStatus = 0
AND A.payStatus IN ( '904', '905', '906', '907','908','909' )
AND A.sourceType IN ('01', '02')
ORDER BY txnTime DESC
LIMIT 0,10;

2. 问题原因

排查 SQL 在 RDS for MySQL 实例 A 和 B 上的执行计划，发现不一致。
执行时间长 - A ：

id	select_type	table		type	possible_keys			key			key_len		ref				rows	Extra
1	PRIMARY	    A	    	ALL		payStatus,sourceType	NULL		NULL		NULL			26427	Using where; Using temporary; Using filesort
1	PRIMARY	    <derived2>	ALL		NULL					NULL		NULL		NULL			8737	Using where; Using join buffer (Block Nested Loop)
1	PRIMARY	    <derived3>	ref		<auto_key0>				<auto_key0>	8			B.invid			10		NULL
3	DERIVED	    <derived4>	ALL		NULL					NULL		NULL		NULL			10694	NULL
3	DERIVED	    C			ref		invid					invid		8			F.invoiceids	1		Using where
4	DERIVED	    D			index	invid					invid		8			NULL			10694	NULL
2	DERIVED	    M			ALL		NULL					NULL		NULL		NULL			8737	Using where

执行时间短 - B：

id	select_type	table		type	possible_keys			key			key_len		ref							rows	Extra
1	PRIMARY		A			index	payStatus,sourceType	txnTime		6			NULL						1		Using where
1	PRIMARY		<derived2>	ref		<auto_key1>				<auto_key1>	8			pc_vqgc_0000.A.unionPayId	15		NULL
1	PRIMARY		<derived3>	ref		<auto_key0>				<auto_key0>	8			B.invid						10		NULL
3	DERIVED		<derived4>	ALL		NULL					NULL		NULL		NULL						10506	NULL
3	DERIVED		C			ref		invid					invid		8			F.invoiceids				1		Using where
4	DERIVED		D			index	invid					invid		8			NULL						10506	NULL
2	DERIVED		M			ALL		sourcetype				NULL		NULL		NULL						8928	Using where

从执行计划对比看问题出现在表 A 和中间表 B 关联这步。

执行计划 A 的 Extra 信息显示 Using join buffer （Block Nested Loop），说明如果选择单纯的 Nested Loop Join 成本会很高（在内层循环无法使用索引的场景下，成本是 O(Rn x Sn)）。

优化器为了提高效率，因此选择了 Block Nested Loop。

对比执行计划 B，内层使用的索引是 MySQL 自动创建的(auto_key1)，检查优化器开关配置是否有区别，以防万一。

// 检查优化器开关配置
show global variables like 'optimizer_switch' \G

*************************** 1. row ***************************
Variable_name: optimizer_switch
        Value: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,subquery_materialization_cost_based=on,use_index_extensions=on

// 输出格式化后
index_merge=on
index_merge_union=on
index_merge_sort_union=on
index_merge_intersection=on
engine_condition_pushdown=on
index_condition_pushdown=on
mrr=on
mrr_cost_based=on
block_nested_loop=on
batched_key_access=off
materialization=on
semijoin=on
loosescan=on
firstmatch=on
subquery_materialization_cost_based=on
use_index_extensions=on

对比两个实例优化器开关配置相同，且 materialization 和 subquery_materialization_cost_based 都已经打开，加之执行计划 A 中有物化表的使用，因此排除掉优化器开关配置问题。

此时问题就比较明朗了，应该是关联的两个字段类型不匹配，导致无法通过索引物化临时表的关联字段来使用 Nested Loop Join。

带着上面的怀疑检查下两个实例的表 Pay 和 inv_msg 的关联字段 PayId 的字段类型。

// ========= 执行快 ========= 
// 表 pay
CREATE TABLE `pay` (
  `PayId` bigint(20) NOT NULL AUTO_INCREMENT,
  `companyId` bigint(20) DEFAULT NULL,
  .......
  `txnEndTime` datetime DEFAULT NULL,
  `deleteStatus` varchar(255) DEFAULT '0',
  PRIMARY KEY (`unionPayId`),
  KEY `companyId` (`companyId`) USING BTREE,
  KEY `invid` (`invId`) USING BTREE,
  KEY `payStatus` (`payStatus`) USING BTREE,
  KEY `sourceType` (`sourceType`) USING BTREE,
  KEY `txnTime` (`txnTime`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=24906 DEFAULT CHARSET=utf8

// 表 inv_msg
CREATE TABLE `inv_msg ` (

`invid` bigint(20) NOT NULL AUTO_INCREMENT,
  `payid` bigint(20) NOT NULL,
  ......
  `invoicestatus` varchar(2) NOT NULL DEFAULT '0',
  `sourcetype` varchar(200) NOT NULL',
  PRIMARY KEY (`invoiceid`),
  KEY `unionpayid` (`unionpayid`) USING BTREE,
  KEY `invoicestatus` (`invoicestatus`) USING BTREE,
  KEY `sourcetype` (`sourcetype`,`unionpayid`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=8897 DEFAULT CHARSET=utf8

// ========= 执行慢 ========= 
// 表 pay
CREATE TABLE `pay` (
  `PayId` bigint(20) NOT NULL AUTO_INCREMENT,
  `companyId` bigint(20) DEFAULT NULL,
  .......
  `txnEndTime` datetime DEFAULT NULL,
  `deleteStatus` varchar(255) DEFAULT '0',
  PRIMARY KEY (`unionPayId`),
  KEY `companyId` (`companyId`) USING BTREE,
  KEY `invid` (`invId`) USING BTREE,
  KEY `payStatus` (`payStatus`) USING BTREE,
  KEY `sourceType` (`sourceType`) USING BTREE,
  KEY `txnTime` (`txnTime`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=24906 DEFAULT CHARSET=utf8

// 表 inv_msg
CREATE TABLE `inv_msg ` (

`invid` bigint(20) NOT NULL AUTO_INCREMENT,
  `payid` varchar(20) NOT NULL,
  ......
  `invoicestatus` varchar(2) NOT NULL DEFAULT '0',
  `sourcetype` varchar(200) NOT NULL',
  PRIMARY KEY (`invoiceid`),
  KEY `unionpayid` (`unionpayid`) USING BTREE,
  KEY `invoicestatus` (`invoicestatus`) USING BTREE,
  KEY `sourcetype` (`sourcetype`,`unionpayid`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=8897 DEFAULT CHARSET=utf8

可以看到 payId 字段在执行快场景下 2 个表都是 big int 类型；而在执行慢的场景下，2个表的字段类型分别为 big int 和 varchar，导致执行计划选择了对无法使用索引场景优化的 Block Netsted Loop。

3. 问题解决

理清问题的根源，就有了针对性的方法。
建议用户修改表 inv_msg 的字段 payid 类型为 big int not null，重新收集统计信息后问题解决。

// 业务低峰期执行
alter table inv_msg algorithm=copy, lock=shared, modify payid bigint not null;  

// 重新收集统计信息
analyze table inv_msg;

4. 问题结论

需要严格遵守规范进行开发工作。
用户 DBA 应该进行 SQL 审核工作。