Need help optimizing a lat/Lon geo search for mysql(需要帮助优化 mysql 的 lat/Lon 地理搜索)
问题描述
我有一个 mysql (5.0.22) myisam 表,其中大约有 300k 条记录,我想在 5 英里半径范围内进行纬度/经度距离搜索.
我有一个涵盖纬度/经度字段的索引,当我只选择纬度/经度时,它的速度很快(毫秒响应).但是当我选择表格中的其他字段时,速度会慢到 5-8 秒.
我正在使用 myisam 来利用全文搜索.其他索引表现良好(例如 select * from Listing where slug = 'xxxxx').
如何优化我的查询、表或索引以加快速度?
我的架构是:
CREATE TABLE `Listing` (`id` int(10) unsigned NOT NULL auto_increment,`name` varchar(125) 整理 utf8_unicode_ci 默认 NULL,`phone` varchar(18) 整理 utf8_unicode_ci 默认 NULL,`fax` varchar(18) 整理 utf8_unicode_ci 默认 NULL,`email` varchar(55) 整理 utf8_unicode_ci 默认 NULL,`photourl` varchar(55) 整理 utf8_unicode_ci 默认 NULL,`thumburl` varchar(5) 整理 utf8_unicode_ci 默认 NULL,`website` varchar(85) 整理 utf8_unicode_ci 默认 NULL,`categoryid` int(10) 无符号默认 NULL,`addressid` int(10) 无符号默认 NULL,`deleted` tinyint(1) 默认 NULL,`status` int(10) 无符号默认 '2',`parentid` int(10) 无符号默认 NULL,`organizationid` int(10) 无符号默认 NULL,`listinginfoid` int(10) 无符号默认 NULL,`createuserid` int(10) 无符号默认 NULL,`createdate` 日期时间默认 NULL,`lasteditdate` 时间戳 NOT NULL 默认 CURRENT_TIMESTAMP 更新 CURRENT_TIMESTAMP,`lastedituserid` int(10) 无符号默认 NULL,`slug` varchar(155) 整理 utf8_unicode_ci 默认 NULL,`aclid` int(10) 无符号默认 NULL,`alt_address` varchar(80) 整理 utf8_unicode_ci 默认 NULL,`alt_website` varchar(80) 整理 utf8_unicode_ci 默认 NULL,`lat` 十进制(10,7) 默认 NULL,`lon` 十进制(10,7) 默认 NULL,`city` varchar(80) 整理 utf8_unicode_ci 默认 NULL,`state` varchar(10) 整理 utf8_unicode_ci 默认 NULL,主键(`id`),KEY `idx_fetch` USING BTREE (`slug`,`deleted`),KEY `idx_loc` (`state`,`city`),KEY `idx_org` (`organizationid`,`status`,`deleted`),KEY `idx_geo_latlon` 使用 BTREE (`status`,`lat`,`lon`),FULLTEXT KEY `idx_name` (`name`)) 引擎=MyISAM 默认字符集=utf8 排序=utf8_unicode_ci ROW_FORMAT=DYNAMIC;
我的查询是:
选择 Listing.name、Listing.categoryid、Listing.lat、Listing.lon, 3956 * 2 * ASIN(SQRT( POWER(SIN((Listing.lat - 37.369195) * pi()/180/2), 2) + COS(Listing.lat * pi()/180) * COS(37.369195 *pi()/180) * POWER(SIN((Listing.lon --122.036849) * pi()/180/2), 2) )) rawgeosearchdistance从上市在哪里Listing.status = '2'AND(在 -122.10913433498 和 -121.96456366502 之间的 Listing.lon )AND(Listing.lat 介于 37.296909665016 和 37.441480334984 之间)HAVING rawgeosearchdistance <5ORDER BY rawgeosearchdistance ASC;
在没有地理搜索的情况下解释计划:
+----+-------------+------------+--------+-----------------+-----------------+---------+------+------+--------------+|编号 |选择类型 |表|类型 |可能的键 |关键 |key_len |参考 |行 |额外 |+----+-------------+------------+-------+-----------------+-----------------+---------+------+------+--------------+|1 |简单 |上市 |范围 |idx_geo_latlon |idx_geo_latlon |19 |空 |第453章使用位置 |+----+-------------+------------+-------+-----------------+-----------------+---------+------+------+--------------+用地理搜索解释计划:
<上一页>+----+-------------+------------+-------+-----------------+-----------------+---------+------+------+------------------------------+|编号 |选择类型 |表|类型 |可能的键 |关键 |key_len |参考 |行 |额外 |+----+-------------+------------+-------+-----------------+-----------------+---------+------+------+------------------------------+|1 |简单 |上市 |范围 |idx_geo_latlon |idx_geo_latlon |19 |空 |第453章使用哪里;使用文件排序 |+----+-------------+------------+-------+-----------------+-----------------+---------+------+------+------------------------------+这是带有覆盖索引的解释计划.以正确的顺序排列列会产生很大的不同:
<上一页>+----+-------------+--------+--------+---------------+----------------+----------+------+--------+---------------------------------------------------+|编号 |选择类型 |表|类型 |可能的键 |关键 |key_len |参考 |行 |额外 |+----+-------------+--------+--------+---------------+----------------+----------+------+--------+---------------------------------------------------+|1 |简单 |上市 |范围 |idx_geo_cover |idx_geo_cover |12 |空 |第453章使用哪里;使用索引;使用文件排序 |+----+-------------+--------+--------+---------------+----------------+----------+------+--------+---------------------------------------------------+
谢谢!
您可能在仅 lat/lon 查询中使用了覆盖索引".当查询使用的索引包含您选择的数据时,就会出现覆盖索引.MySQL 只需要访问索引而不需要访问数据行.查看更多信息.这可以解释为什么纬度/经度查询如此之快.
我怀疑计算和返回的行数会减慢更长的查询速度.(加上必须为 having 子句创建的任何临时表).
I have a mysql (5.0.22) myisam table with roughly 300k records in it and I want to do a lat/lon distance search within a five mile radius.
I have an index that covers the lat/lon fields and is fast (milisecond response) when I just select for lat/lon. But when I select for additional fields in the table is slows down horribly to 5-8 seconds.
I'm using myisam to take advantage of fulltext search. The other indexes perform well (e.g. select * from Listing where slug = 'xxxxx').
How can I optimize my query, table or index to speed things up?
My schema is:
CREATE TABLE `Listing` (
`id` int(10) unsigned NOT NULL auto_increment,
`name` varchar(125) collate utf8_unicode_ci default NULL,
`phone` varchar(18) collate utf8_unicode_ci default NULL,
`fax` varchar(18) collate utf8_unicode_ci default NULL,
`email` varchar(55) collate utf8_unicode_ci default NULL,
`photourl` varchar(55) collate utf8_unicode_ci default NULL,
`thumburl` varchar(5) collate utf8_unicode_ci default NULL,
`website` varchar(85) collate utf8_unicode_ci default NULL,
`categoryid` int(10) unsigned default NULL,
`addressid` int(10) unsigned default NULL,
`deleted` tinyint(1) default NULL,
`status` int(10) unsigned default '2',
`parentid` int(10) unsigned default NULL,
`organizationid` int(10) unsigned default NULL,
`listinginfoid` int(10) unsigned default NULL,
`createuserid` int(10) unsigned default NULL,
`createdate` datetime default NULL,
`lasteditdate` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
`lastedituserid` int(10) unsigned default NULL,
`slug` varchar(155) collate utf8_unicode_ci default NULL,
`aclid` int(10) unsigned default NULL,
`alt_address` varchar(80) collate utf8_unicode_ci default NULL,
`alt_website` varchar(80) collate utf8_unicode_ci default NULL,
`lat` decimal(10,7) default NULL,
`lon` decimal(10,7) default NULL,
`city` varchar(80) collate utf8_unicode_ci default NULL,
`state` varchar(10) collate utf8_unicode_ci default NULL,
PRIMARY KEY (`id`),
KEY `idx_fetch` USING BTREE (`slug`,`deleted`),
KEY `idx_loc` (`state`,`city`),
KEY `idx_org` (`organizationid`,`status`,`deleted`),
KEY `idx_geo_latlon` USING BTREE (`status`,`lat`,`lon`),
FULLTEXT KEY `idx_name` (`name`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci ROW_FORMAT=DYNAMIC;
My query is:
SELECT Listing.name, Listing.categoryid, Listing.lat, Listing.lon
, 3956 * 2 * ASIN(SQRT( POWER(SIN((Listing.lat - 37.369195) * pi()/180 / 2), 2) + COS(Listing.lat * pi()/180) * COS(37.369195 * pi()/180) * POWER(SIN((Listing.lon --122.036849) * pi()/180 / 2), 2) )) rawgeosearchdistance
FROM Listing
WHERE
Listing.status = '2'
AND ( Listing.lon between -122.10913433498 and -121.96456366502 )
AND ( Listing.lat between 37.296909665016 and 37.441480334984)
HAVING rawgeosearchdistance < 5
ORDER BY rawgeosearchdistance ASC;
Explain plan without geosearch:
+----+-------------+------------+-------+-----------------+-----------------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len |ref | rows | Extra | +----+-------------+------------+-------+-----------------+-----------------+---------+------+------+-------------+ | 1 | SIMPLE | Listing | range | idx_geo_latlon | idx_geo_latlon | 19 | NULL | 453 | Using where | +----+-------------+------------+-------+-----------------+-----------------+---------+------+------+-------------+
Explain plan with geosearch:
+----+-------------+------------+-------+-----------------+-----------------+---------+------+------+-----------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+-------+-----------------+-----------------+---------+------+------+-----------------------------+ | 1 | SIMPLE | Listing | range | idx_geo_latlon | idx_geo_latlon | 19 | NULL | 453 | Using where; Using filesort | +----+-------------+------------+-------+-----------------+-----------------+---------+------+------+-----------------------------+
Here's the explain plan with the covering index. Having the columns in the correct order made a big difference:
+----+-------------+--------+-------+---------------+---------------+---------+------+--------+------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+--------+-------+---------------+---------------+---------+------+--------+------------------------------------------+ | 1 | SIMPLE | Listing | range | idx_geo_cover | idx_geo_cover | 12 | NULL | 453 | Using where; Using index; Using filesort | +----+-------------+--------+-------+---------------+---------------+---------+------+--------+------------------------------------------+
Thank you!
You are probably using a 'covering index' in your lat/lon only query. A covering index occurs when the index used by the query contains the data that you are selecting for. MySQL only needs to visit the index and never the data rows. See this for more info. That would explain why the lat/lon query is so fast.
I suspect that the calculations and the sheer number of rows returned, slows down the longer query. (plus any temp table that has to be created for the having clause).
这篇关于需要帮助优化 mysql 的 lat/Lon 地理搜索的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!
本文标题为:需要帮助优化 mysql 的 lat/Lon 地理搜索
- 使用 Oracle PL/SQL developer 生成测试数据 2021-01-01
- SQL 临时表问题 2022-01-01
- 如何将 Byte[] 插入 SQL Server VARBINARY 列 2021-01-01
- 远程 mySQL 连接抛出“无法使用旧的不安全身份验证连接到 MySQL 4.1+"来自 XAMPP 的错误 2022-01-01
- 如何使用 pip 安装 Python MySQLdb 模块? 2021-01-01
- 如何将 SonarQube 6.7 从 MySQL 迁移到 postgresql 2022-01-01
- 以一个值为轴心,但将一行上的数据按另一行分组? 2022-01-01
- 在SQL中,如何为每个组选择前2行 2021-01-01
- 更改自动增量起始编号? 2021-01-01
- 导入具有可变标题的 Excel 文件 2021-01-01