Mysql优化器对in list的处理
select * from table where id in (....)
这样的查询,是走范围索引还是走等值索引?
select * from table where key_part1 in (....) and key_part2='XX';
这样的查询,第二部分还走不走索引?
测试目的,想知道,MYSQL对IN LIST是如何选择执行计划的;在单字段索引和复合索引中;
[@more@]
mysql 5.1.40
os:rhel 5.4
engine=innodb
innodb_file_per_table
# 先来创建测试环境:
create table index_test ( id int auto_increment , col1 int ,col2 varchar(200) ,content varchar(500),primary key (id) ,key col1 (col1) ) engine=innodb default charset=latin1;
# repeat insert operation 12 times
insert into index_test (col1,col2) select @rownum:=@rownum+1,column_name from information_schema.COLUMNS c , (select @rownum:=0 ) id limit 500 ;
# 测试1:先测对主键的IN操作;
# 测试用例:
reset query cache; --清空QUERY_CAHCE
show status like 'Innodb_buffer_pool_read_requests' ; --用来查询逻辑读
select * from index_test where id in (2,10,1000,2000,9000);
show status like 'Innodb_buffer_pool_read_requests' ; --与前面的结果相减,就得到SQL执行所带来的逻辑读 ;
为了逻辑读的准确性, 对同一个SQL你应该多跑几次,以去掉物理读 ;
root@127.0.0.1 : test 16:02:16> explain select * from index_test where id in (2,10,1000,2000);
+----+-------------+------------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | index_test | range | PRIMARY | PRIMARY | 4 | NULL | 4 | Using where |
+----+-------------+------------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)
# 从执行计划看,走的是范围条件;但我们看看实际情况 :
# 注意,为了减少篇幅,我把各个查询的结果给删减了。
select * from index_test where id in (2,10);
RESULTs: 2 rows
LIO : 4
select * from index_test where id in (2,1000);
RESULTs: 2 rows
LIO : 4
select * from index_test where id in (2,10,100);
RESULTs: 3 rows
LIO : 6
select * from index_test where id in (2,10,1000,2000);
RESULTs: 4 rows
LIO : 8
select * from index_test where id in (2,10,1000,2000,9000);
RESULTs: 5 rows
LIO : 10
### 在这里看到,逻辑读根据IN LIST里KEY的数量成线性增加,而没有根据KEY值的大小变化,所以我们判断,对主键的IN操作,其实都转成了OR操作。
# 测试2:对非主键的IN操作;
# 测试用例:
reset query cache;
show status like 'Innodb_buffer_pool_read_requests' ;
select * from index_test where col1 in (100,500,300,400);
show status like 'Innodb_buffer_pool_read_requests' ;
root@127.0.0.1 : test 16:06:33> explain select * from index_test where col1 in (100,200);
+----+-------------+------------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | index_test | range | col1 | col1 | 5 | NULL | 24 | Using where |
+----+-------------+------------+-------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)
select * from index_test where col1 in (100,101);
RESULTs: 24 rows
LIO : 86
select * from index_test where col1 in (100,500);
RESULTs: 24 rows
LIO : 86
select * from index_test where col1 in (100,500,300);
RESULTs: 36 rows
LIO : 139
select * from index_test where col1 in (100,500,300,400);
RESULTs: 48 rows
LIO : 172
分析: 这个结果与测试1的结果是一样的;
# 测试3:对复合索引的前列IN操作;
alter table index_test drop index col1 ,add index col1col2(col1,col2) ;
update index_test set content=concat(col2,col3,col1) ;
主要是测一下,索引的第一个字段用IN后,优化器还会不会使用第二个字段来进行索引搜索;
root@127.0.0.1 : test 18:41:38> explain select content from index_test where col1 in (100,500,300,400) and col2='aaaa';
+----+-------------+------------+-------+---------------+----------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+----------+---------+------+------+-------------+
| 1 | SIMPLE | index_test | range | col1col2 | col1col2 | 208 | NULL | 4 | Using where |
+----+-------------+------------+-------+---------------+----------+---------+------+------+-------------+
1 row in set (0.00 sec)
select count(*) from index_test where col1 in (100,500,300,400) and col2='aaaa';
RESULTs: 0 rows
LIO : 24
select content from index_test where col1 in (100,500,300,400) and col2='aaaa';
RESULTs: 0 rows
LIO : 24
分析:
#我们发现,两个查询的逻辑读是一样,其实这已经表明优化器用上了索引的第二个字段,在索引搜索部分就完成了对COL2的过滤;
总结:MYSQL优化器对in list是转成“or” 的“多个等值”查询来处理的;并没有转成范围查询 ;
免责声明:
① 本站未注明“稿件来源”的信息均来自网络整理。其文字、图片和音视频稿件的所属权归原作者所有。本站收集整理出于非商业性的教育和科研之目的,并不意味着本站赞同其观点或证实其内容的真实性。仅作为临时的测试数据,供内部测试之用。本站并未授权任何人以任何方式主动获取本站任何信息。
② 本站未注明“稿件来源”的临时测试数据将在测试完成后最终做删除处理。有问题或投稿请发送至: 邮箱/279061341@qq.com QQ/279061341