quot;SELECT COUNT(*)quot; is slow, even with where clause(“SELECT COUNT(*)很慢,即使有 where 子句)
问题描述
我想弄清楚如何优化 MySQL 中非常慢的查询(我没有设计这个):
I'm trying to figure out how to optimize a very slow query in MySQL (I didn't design this):
SELECT COUNT(*) FROM change_event me WHERE change_event_id > '1212281603783391';
+----------+
| COUNT(*) |
+----------+
| 3224022 |
+----------+
1 row in set (1 min 0.16 sec)
将其与完整计数进行比较:
Comparing that to a full count:
select count(*) from change_event;
+----------+
| count(*) |
+----------+
| 6069102 |
+----------+
1 row in set (4.21 sec)
解释语句在这里对我没有帮助:
The explain statement doesn't help me here:
explain SELECT COUNT(*) FROM change_event me WHERE change_event_id > '1212281603783391'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: me
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 8
ref: NULL
rows: 4120213
Extra: Using where; Using index
1 row in set (0.00 sec)
好吧,它仍然认为它需要大约 400 万个条目来计数,但我可以比这更快地计算文件中的行数!我不明白为什么 MySQL 要花这么长时间.
OK, it still thinks it needs roughly 4 million entries to count, but I could count lines in a file faster than that! I don't understand why MySQL is taking this long.
这是表定义:
CREATE TABLE `change_event` (
`change_event_id` bigint(20) NOT NULL default '0',
`timestamp` datetime NOT NULL,
`change_type` enum('create','update','delete','noop') default NULL,
`changed_object_type` enum('Brand','Broadcast','Episode','OnDemand') NOT NULL,
`changed_object_id` varchar(255) default NULL,
`changed_object_modified` datetime NOT NULL default '1000-01-01 00:00:00',
`modified` datetime NOT NULL default '1000-01-01 00:00:00',
`created` datetime NOT NULL default '1000-01-01 00:00:00',
`pid` char(15) default NULL,
`episode_pid` char(15) default NULL,
`import_id` int(11) NOT NULL,
`status` enum('success','failure') NOT NULL,
`xml_diff` text,
`node_digest` char(32) default NULL,
PRIMARY KEY (`change_event_id`),
KEY `idx_change_events_changed_object_id` (`changed_object_id`),
KEY `idx_change_events_episode_pid` (`episode_pid`),
KEY `fk_import_id` (`import_id`),
KEY `idx_change_event_timestamp_ce_id` (`timestamp`,`change_event_id`),
KEY `idx_change_event_status` (`status`),
CONSTRAINT `fk_change_event_import` FOREIGN KEY (`import_id`) REFERENCES `import` (`import_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
版本:
$ mysql --version
mysql Ver 14.12 Distrib 5.0.37, for pc-solaris2.8 (i386) using readline 5.0
有什么明显的我遗漏了吗?(是的,我已经尝试过SELECT COUNT(change_event_id)",但没有性能差异).
Is there something obvious I'm missing? (Yes, I've already tried "SELECT COUNT(change_event_id)", but there's no performance difference).
推荐答案
InnoDB 使用聚集主键,因此主键与行一起存储在数据页中,而不是在单独的索引页中.为了进行范围扫描,您仍然必须扫描数据页中所有可能宽的行;请注意,此表包含一个 TEXT 列.
InnoDB uses clustered primary keys, so the primary key is stored along with the row in the data pages, not in separate index pages. In order to do a range scan you still have to scan through all of the potentially wide rows in data pages; note that this table contains a TEXT column.
我会尝试两件事:
- 运行
优化表
.这将确保数据页按排序顺序物理存储.可以想象,这可以加快对集群主键的范围扫描. - 仅在 change_event_id 列上创建一个额外的非主索引.这将在索引页中存储该列的副本,扫描速度要快得多.创建后,检查解释计划以确保它使用新索引.
- run
optimize table
. This will ensure that the data pages are physically stored in sorted order. This could conceivably speed up a range scan on a clustered primary key. - create an additional non-primary index on just the change_event_id column. This will store a copy of that column in index pages which be much faster to scan. After creating it, check the explain plan to make sure it's using the new index.
(如果 change_event_id 列从零开始递增,您可能还希望将其设为 bigint unsigned)
(you also probably want to make the change_event_id column bigint unsigned if it's incrementing from zero)
这篇关于“SELECT COUNT(*)"很慢,即使有 where 子句的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!
本文标题为:“SELECT COUNT(*)"很慢,即使有 where 子句


- 更改自动增量起始编号? 2021-01-01
- 在SQL中,如何为每个组选择前2行 2021-01-01
- 如何使用 pip 安装 Python MySQLdb 模块? 2021-01-01
- 如何将 SonarQube 6.7 从 MySQL 迁移到 postgresql 2022-01-01
- 以一个值为轴心,但将一行上的数据按另一行分组? 2022-01-01
- SQL 临时表问题 2022-01-01
- 如何将 Byte[] 插入 SQL Server VARBINARY 列 2021-01-01
- 使用 Oracle PL/SQL developer 生成测试数据 2021-01-01
- 远程 mySQL 连接抛出“无法使用旧的不安全身份验证连接到 MySQL 4.1+"来自 XAMPP 的错误 2022-01-01
- 导入具有可变标题的 Excel 文件 2021-01-01