字符串字段误使用数字进行查询,会导致隐式类型转换,无法命中索引的坑我相信多数大佬都踩过。
我们来思考以下场景的组合

  1. 主键索引是varchar,where条件使用int
  2. 主键索引是bigint,where条件使用string
  3. 普通索引是varchar,where条件使用int
  4. 普通索引是bigint,where条件使用string
  5. 唯一索引是varchar,where条件使用int
  6. 唯一索引是bigint,where条件使用string

通过测试来看一下这一组数据。使用的数据库版本是MySQL8.0

mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.27    |
+-----------+

表结构为

mysql> desc `index_test`;
+-------------------+-----------------+------+-----+---------+----------------+
| Field             | Type            | Null | Key | Default | Extra          |
+-------------------+-----------------+------+-----+---------+----------------+
| id                | bigint unsigned | NO   | PRI | NULL    | auto_increment |
| char_index_normal | varchar(50)     | NO   | MUL |         |                |
| int_index_normal  | bigint          | NO   | MUL | 0       |                |
| char_index_unique | varchar(50)     | NO   | UNI |         |                |
| int_index_unique  | bigint          | NO   | UNI | 0       |                |
+-------------------+-----------------+------+-----+---------+----------------+
create table `index_test`
(
    `id`                bigint(20) unsigned not null auto_increment,
    `char_index_normal` varchar(50)         not null default '',
    `int_index_normal`  bigint(20)          not null default 0,
    `char_index_unique` varchar(50)         not null default '',
    `int_index_unique`  bigint(20)          not null default 0,
    primary key (`id`),
    key `idx_char_normal` (`char_index_normal`),
    key `idx_int_normal` (`int_index_normal`),
    unique key `idx_char_unique` (`char_index_unique`),
    unique key `idx_int_unique` (`int_index_unique`)
) engine = InnoDB
  default charset = utf8mb4

往表中插入3000条测试数据。

1.主键索引是varchar,where条件使用int

由于上面的表的主键是bigint,所以需要重新创建一张表将数据类型改为varchar,其他结构无影响

mysql> explain select * from index_test where id = 2000;
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | index_test | NULL       | ALL  | PRIMARY       | NULL | NULL    | NULL | 3057 |    10.00 | Using where |
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+-------------+

可以看到,索引失效。

2.主键索引是bigint,where条件使用string

我们执行这这条语句

mysql> explain select * from index_test where id = '20';
+----+-------------+------------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table      | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+------------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | index_test | NULL       | const | PRIMARY       | PRIMARY | 8       | const |    1 |   100.00 | NULL  |
+----+-------------+------------+------------+-------+---------------+---------+---------+-------+------+----------+-------+

查询走了索引。

3.普通索引是varchar,where条件使用int

执行以下语句

mysql> explain select * from index_test where char_index_normal = 22647085336509644766;
+----+-------------+------------+------------+------+-----------------+------+---------+------+------+----------+-------------+
| id | select_type | table      | partitions | type | possible_keys   | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+------------+------------+------+-----------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | index_test | NULL       | ALL  | idx_char_normal | NULL | NULL    | NULL | 3000 |    10.00 | Using where |
+----+-------------+------------+------------+------+-----------------+------+---------+------+------+----------+-------------+

可以看到,varchar类型的字段使用int的where进行查询时,索引失效。

4.普通索引是bigint,where条件使用string

执行以下sql

mysql> explain select * from index_test where int_index_normal = '3113859900';
+----+-------------+------------+------------+------+----------------+----------------+---------+-------+------+----------+-----------------------+
| id | select_type | table      | partitions | type | possible_keys  | key            | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+------------+------------+------+----------------+----------------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | index_test | NULL       | ref  | idx_int_normal | idx_int_normal | 8       | const |    1 |   100.00 | Using index condition |
+----+-------------+------------+------------+------+----------------+----------------+---------+-------+------+----------+-----------------------+

查询走了索引。

5.唯一索引是varchar,where条件使用int

执行以下sql

mysql> explain select * from index_test where char_index_unique = 34766472174299612279;
+----+-------------+------------+------------+------+-----------------+------+---------+------+------+----------+-------------+
| id | select_type | table      | partitions | type | possible_keys   | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+------------+------------+------+-----------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | index_test | NULL       | ALL  | idx_char_unique | NULL | NULL    | NULL | 3000 |    10.00 | Using where |
+----+-------------+------------+------------+------+-----------------+------+---------+------+------+----------+-------------+

走了全表扫描,没有走索引。

6.唯一索引是bigint,where条件使用string

执行以下sql

mysql> explain select * from index_test where int_index_unique = '6853864119';
+----+-------------+------------+------------+-------+----------------+----------------+---------+-------+------+----------+-------+
| id | select_type | table      | partitions | type  | possible_keys  | key            | key_len | ref   | rows | filtered | Extra |
+----+-------------+------------+------------+-------+----------------+----------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | index_test | NULL       | const | idx_int_unique | idx_int_unique | 8       | const |    1 |   100.00 | NULL  |
+----+-------------+------------+------------+-------+----------------+----------------+---------+-------+------+----------+-------+

查询走了索引。

通过以上6条sql观察到,有的走了索引,有的没有走索引,接下来就这种情况进行分析。

【思考】隐式类型转换的规则是什么?

有一个非常简单的方法,可以验证隐式类型转换的规则,就是看 SELECT '10' > 9 和 SELECT 9 > '10' 的结果:

如果规则是 “将字符串转成数字”,那么就是做数字比较,SELECT '10' > 9 的结果应该是 1,SELECT 9 > '10' 的结果应该是 0;
如果规则是 “将数字转成字符串”,那么就是做字符串比较,SELECT '10' > 9 的结果应该是 0,SELECT 9 > '10' 的结果应该是 1。

mysql> SELECT '10' > 9;
+----------+
| '10' > 9 |
+----------+
|        1 |
+----------+

mysql> SELECT 9 > '10';
+----------+
| 9 > '10' |
+----------+
|        0 |
+----------+

可见,SELECT '10' > 9 的结果是 1,SELECT 9 > '10' 的结果是 0,所以可以确认 MySQL 类型转换规则是:当字符串和数字做比较时,会将字符串转换成数字。

因此,当id是varchar时,我们使用下面语句进行查询

select * from index_test where id = 20;

对于MySQL优化器来说,这个语句相当于将索引上的id 转换成 int 类型再与输入的值进行比较

select * from index_test where cast(id as signed int) = 20;

众所周知,如果查询时对索引字段进行函数操作,查询过程将无法使用索引

所以,无论是主键索引,唯一索引还是普通索引,只要涉及到索引上的值需要做类型转换,必定会导致索引失效。

不过更好的情况是什么呢,就是选择为索引值或者数量时,走的是索引扫描index

mysql> explain select id from index_test where id = 2000;
+----+-------------+------------+------------+-------+---------------+----------------+---------+------+------+----------+--------------------------+
| id | select_type | table      | partitions | type  | possible_keys | key            | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+------------+------------+-------+---------------+----------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | index_test | NULL       | index | PRIMARY       | idx_int_unique | 8       | NULL | 3057 |    10.00 | Using where; Using index |
+----+-------------+------------+------------+-------+---------------+----------------+---------+------+------+----------+--------------------------+

mysql> explain select sum(1) from index_test where char_index_normal = 95368041795468887595;
+----+-------------+------------+------------+-------+-----------------+-----------------+---------+------+------+----------+--------------------------+
| id | select_type | table      | partitions | type  | possible_keys   | key             | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+------------+------------+-------+-----------------+-----------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | index_test | NULL       | index | idx_char_normal | idx_char_normal | 202     | NULL | 3057 |    10.00 | Using where; Using index |
+----+-------------+------------+------------+-------+-----------------+-----------------+---------+------+------+----------+--------------------------+

【思考】为什么触发隐式类型转换,查询数据时就需要全表扫描

对于 InnoDB 的 B+树索引结构,我们有以下一组数据

1, 2, 3, 4, 6, 6, 7, 11, 13, 21, 23, 39, 42, 61, 71, 
101, 201, 220, 303, 345, 411, 601, 620, 701, 1402, 3333

当作为数值类型建立索引时,B+树索引结构如下:
image

当作为字符串类型建立索引时,数据顺序和B+树索引结构如下:

1, 101, 11, 13, 1402, 2, 201, 21, 220, 23, 3, 303, 3333, 
345, 39, 4, 411, 42, 6, 6, 601, 61, 620, 7, 701, 71

image-1654012833935
B+ 树索引的快速定位能力,来源于同一层兄弟节点的有序性。对索引字段做函数操作,可能会破坏索引值的有序性。

而当字符串和数字做比较时,会将字符串转换成数字,顺序就会被破坏。

MySQL手册上有详细描述,其中一个观点是,如果要把一个数字转成字符串,那它的可能性太多,所以就选择了不走索引。

To retrieve rows from other tables when performing joins. MySQL can use indexes on columns more efficiently if they are declared as the same type and size. In this context, VARCHAR and CHAR are considered the same if they are declared as the same size. For example, VARCHAR(10) and CHAR(10) are the same size, but VARCHAR(10) and CHAR(15) are not.

For comparisons between nonbinary string columns, both columns should use the same character set. For example, comparing a utf8 column with a latin1 column precludes use of an index.

Comparison of dissimilar columns (comparing a string column to a temporal or numeric column, for example) may prevent use of indexes if values cannot be compared directly without conversion. For a given value such as 1 in the numeric column, it might compare equal to any number of values in the string column such as ‘1’, ’ 1’, ‘00001’, or ‘01.e1’. This rules out use of any indexes for the string column.

引用

MySQL为什么varchar字段用数字查无法命中索引,而int字段用字符串查却能命中?