Using EXPLAIN to Write Better MySQL Queries

2023-11-11

When you issue a query, the MySQL Query Optimizer tries to devise an optimal plan for query execution. You can see information about the plan by prefixing the query with EXPLAIN. EXPLAIN is one of the most powerful tools at your disposal for understanding and optimizing troublesome MySQL queries, but it’s a sad fact that many developers rarely make use of it. In this article you’ll learn what the output of EXPLAIN can be and how to use it to optimize your schema and queries.

Understanding EXPLAIN’s Output

Using EXPLAIN is as simple as prepending it to a SELECT query. Let’s analyze the output of a simple query to get familiar with the columns the command returns.

EXPLAIN SELECT * FROM categories\G
********************** 1. row **********************
           id: 1
  select_type: SIMPLE
        table: categories
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4
        Extra:
1 row in set (0.00 sec)

It may not seem like it, but there’s a lot of information packed into those 10 columns! The columns returned by the query are:

id – a sequential identifier for each SELECT within the query (for when you have nested subqueries)

select_type – the type of SELECT query. Possible values are:

SIMPLE – the query is a simple SELECT query without any subqueries or UNIONs

PRIMARY – the outermost SELECT in a query that contains subqueries or a UNION

DERIVED – the SELECT is part of a subquery within a FROM clause

SUBQUERY – the first SELECT in a subquery

DEPENDENT SUBQUERY – a subquery that depends on the outer query

UNCACHEABLE SUBQUERY – a subquery whose result cannot be cached and must be re-evaluated for each row of the outer query

UNION – the SELECT is the second or later statement of a UNION

DEPENDENT UNION – the second or later SELECT of a UNION is dependent on an outer query

UNION RESULT – the SELECT is a result of a UNION

table – the table referred to by the row

type – how MySQL joins the tables used. This is one of the most insightful fields in the output because it can indicate missing indexes or suggest that the query should be rewritten. Possible values, listed roughly from best to worst, are:

system – the table has only zero or one row

const – the table has only one matching row, which is found using an index. This is the fastest type of join because the table only has to be read once and the column’s value can be treated as a constant when joining other tables.

eq_ref – all parts of an index are used by the join and the index is PRIMARY KEY or UNIQUE NOT NULL. This is the next best possible join type.

ref – all of the matching rows of an indexed column are read for each combination of rows from the previous table. This type of join appears for indexed columns compared using = or <=> operators.

fulltext – the join uses the table’s FULLTEXT index.

ref_or_null – this is the same as ref but also contains rows with a null value for the column.

index_merge – the join uses a list of indexes to produce the result set. The key column of EXPLAIN’s output will contain the keys used.

unique_subquery – an IN subquery returns only one result from the table and makes use of the primary key.

index_subquery – the same as unique_subquery but returns more than one result row.

range – an index is used to find matching rows in a specific range, typically when the key column is compared to a constant using operators like BETWEEN, IN, >, >=, etc.

index – the entire index tree is scanned to find matching rows.

ALL – the entire table is scanned to find matching rows for the join. This is the worst join type and usually indicates the lack of appropriate indexes on the table.
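
The ordering of these join types can be made concrete with a small helper. The following is only a sketch: it assumes the list above runs from best to worst (the same order the MySQL manual uses), and the names ACCESS_TYPES, access_rank, and worst_access_type are hypothetical, not part of MySQL or any library.

```python
# Join types in the order listed above, best first (an assumption
# taken from the list, not something MySQL exposes as an API).
ACCESS_TYPES = [
    "system", "const", "eq_ref", "ref", "fulltext", "ref_or_null",
    "index_merge", "unique_subquery", "index_subquery", "range",
    "index", "ALL",
]


def access_rank(access_type):
    """Return the position of a join type in the list (0 is best)."""
    return ACCESS_TYPES.index(access_type)


def worst_access_type(types):
    """Return the least efficient join type among the given ones."""
    return max(types, key=access_rank)


if __name__ == "__main__":
    # Values copied from the type column of a multi-row EXPLAIN result.
    print(worst_access_type(["const", "eq_ref", "ALL"]))  # ALL
```

Scanning the type column of each row for the worst value is usually a quick way to decide which table to look at first.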

possible_keys – shows the keys that can be used by MySQL to find rows from the table, though they may or may not be used in practice. In fact, this column can often help in optimizing queries since if the column is NULL, it indicates no relevant indexes could be found.

key – indicates the actual index used by MySQL. This column may contain an index that is not listed in the possible_keys column. The MySQL optimizer always looks for an optimal key that can be used for the query. While joining many tables, it may find other keys that are not listed in possible_keys but are more optimal.

key_len – indicates the length, in bytes, of the index the Query Optimizer chose to use. For example, a key_len value of 4 means the key occupies four bytes, which is the size of an INT column. Check out MySQL’s data type storage requirements to know more about this.

ref – shows the columns or constants that are compared to the index named in the key column. MySQL will either pick a constant value to be compared or a column itself based on the query execution plan. You can see this in the example given below.

rows – lists the number of records MySQL estimates it must examine to produce the output. This is another important column worth focusing on when optimizing queries, especially for queries that use JOIN and subqueries.

Extra – contains additional information regarding the query execution plan. Values such as “Using temporary”, “Using filesort”, etc. in this column may indicate a troublesome query. For a complete list of possible values and their meaning, check out the MySQL documentation.
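
To make the column descriptions above more tangible, here is a minimal sketch that reads the vertical output shown earlier into dictionaries and reports the symptoms the list mentions. The helper names parse_vertical and warning_signs are hypothetical, and the checks are a rough heuristic based only on this article, not an official rule set.

```python
import re

# Vertical EXPLAIN output, as printed by the mysql client for the
# categories query shown earlier.
SAMPLE = """\
********************** 1. row **********************
           id: 1
  select_type: SIMPLE
        table: categories
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4
        Extra:
1 row in set (0.00 sec)
"""


def parse_vertical(text):
    """Split vertical client output into one dict per result row."""
    rows = []
    for line in text.splitlines():
        if re.match(r"^\*+ \d+\. row \*+$", line.strip()):
            rows.append({})  # a row separator starts a new dict
        elif ":" in line and rows:
            name, _, value = line.partition(":")
            rows[-1][name.strip()] = value.strip()
    return rows


def warning_signs(row):
    """List the symptoms described above that a single row shows."""
    signs = []
    if row.get("type") == "ALL":
        signs.append("full table scan")
    if row.get("possible_keys") == "NULL":
        signs.append("no candidate index")
    for marker in ("Using temporary", "Using filesort"):
        if marker in row.get("Extra", ""):
            signs.append(marker)
    return signs


if __name__ == "__main__":
    for row in parse_vertical(SAMPLE):
        print(row["table"], warning_signs(row))
```

Working on the text output keeps the sketch independent of any database driver; with a driver, the same checks could be applied to the rows it returns.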

You can also add the keyword EXTENDED after EXPLAIN in your query and MySQL will show you additional information about the way it executes the query. To see the information, follow your EXPLAIN query with SHOW WARNINGS. This is mostly useful for seeing the query that is executed after any transformations have been made by the Query Optimizer. (In MySQL 5.7 and later this extra information is produced by default, and the EXTENDED keyword was removed in MySQL 8.0.)

EXPLAIN EXTENDED
SELECT City.Name FROM City
JOIN Country ON (City.CountryCode = Country.Code)
WHERE City.CountryCode = 'IND' AND Country.Continent = 'Asia'\G
********************** 1. row **********************
           id: 1
  select_type: SIMPLE
        table: Country
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 3
          ref: const
         rows: 1
     filtered: 100.00
        Extra:
********************** 2. row **********************
           id: 1
  select_type: SIMPLE
        table: City
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4079
     filtered: 100.00
        Extra: Using where
2 rows in set, 1 warning (0.00 sec)

SHOW WARNINGS\G
********************** 1. row **********************
  Level: Note
   Code: 1003
Message: select `World`.`City`.`Name` AS `Name` from `World`.`City` join `World`.`Country` where ((`World`.`City`.`CountryCode` = 'IND'))
1 row in set (0.00 sec)

Troubleshooting Performance with EXPLAIN

Now let’s take a look at how we can optimize a poorly performing query by analyzing the output of EXPLAIN. When dealing with a real-world application there’ll undoubtedly be a number of tables with many relations between them, but sometimes it’s hard to anticipate the optimal way to write a query.

Here I’ve created a sample database for an e-commerce application which does not have any indexes or primary keys, and will demonstrate the impact of such a bad design by writing a pretty awful query. You can download the schema sample from GitHub.

EXPLAIN SELECT * FROM
orderdetails d
INNER JOIN orders o ON d.orderNumber = o.orderNumber
INNER JOIN products p ON p.productCode = d.productCode
INNER JOIN productlines l ON p.productLine = l.productLine
INNER JOIN customers c ON c.customerNumber = o.customerNumber
WHERE o.orderNumber = 10101\G
********************** 1. row **********************
           id: 1
  select_type: SIMPLE
        table: l
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 7
        Extra:
********************** 2. row **********************
           id: 1
  select_type: SIMPLE
        table: p
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 110
        Extra: Using where; Using join buffer
********************** 3. row **********************
           id: 1
  select_type: SIMPLE
        table: c
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 122
        Extra: Using join buffer
********************** 4. row **********************
           id: 1
  select_type: SIMPLE
        table: o
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 326
        Extra: Using where; Using join buffer
********************** 5. row **********************
           id: 1
  select_type: SIMPLE
        table: d
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2996
        Extra: Using where; Using join buffer
5 rows in set (0.00 sec)

If you look at the above result, you can see all of the symptoms of a bad query. But even if I wrote a better query, the results would still be the same since there are no indexes. The join type is shown as “ALL” (which is the worst), which means MySQL was unable to identify any keys that can be used in the join, and hence the possible_keys and key columns are NULL. Most importantly, the rows column shows that MySQL scans all of the records of each table for the query. That means that to execute the query, it will examine 7 × 110 × 122 × 326 × 2996 = 91,750,822,240 row combinations to find the four matching results. That’s really horrible, and because the figure is a product of the table sizes, it will grow rapidly as the database grows.
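
The arithmetic above can be checked with a few lines of code. This is a sketch under the article’s own simplification that the cost of the join is the product of the per-table rows estimates; the names ROWS_BEFORE, estimated_combinations, and growth_factor are hypothetical.

```python
from math import prod

# Values of the rows column from the EXPLAIN output above, keyed by
# table alias.
ROWS_BEFORE = {"l": 7, "p": 110, "c": 122, "o": 326, "d": 2996}


def estimated_combinations(rows_by_table):
    """Multiply the per-table row estimates of the join."""
    return prod(rows_by_table.values())


def growth_factor(rows_by_table, scale):
    """How much the estimate grows if every table grows by `scale`."""
    scaled = {table: rows * scale for table, rows in rows_by_table.items()}
    return estimated_combinations(scaled) // estimated_combinations(rows_by_table)


if __name__ == "__main__":
    print(f"{estimated_combinations(ROWS_BEFORE):,}")  # 91,750,822,240
    print(growth_factor(ROWS_BEFORE, 2))               # 32
```

With five unindexed tables in the join, doubling the size of every table multiplies the estimate by 2 to the power of 5, that is, by 32.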

Now let’s add some obvious indexes, such as primary keys for each table, and execute the query once again. As a general rule of thumb, the columns used in the JOIN clauses of the query are good candidates for keys because MySQL has to search those columns to find matching records.

ALTER TABLE customers ADD PRIMARY KEY (customerNumber);
ALTER TABLE employees ADD PRIMARY KEY (employeeNumber);
ALTER TABLE offices ADD PRIMARY KEY (officeCode);
ALTER TABLE orderdetails ADD PRIMARY KEY (orderNumber, productCode);
ALTER TABLE orders ADD PRIMARY KEY (orderNumber), ADD KEY (customerNumber);
ALTER TABLE payments ADD PRIMARY KEY (customerNumber, checkNumber);
ALTER TABLE productlines ADD PRIMARY KEY (productLine);
ALTER TABLE products ADD PRIMARY KEY (productCode), ADD KEY (buyPrice), ADD KEY (productLine);
ALTER TABLE productvariants ADD PRIMARY KEY (variantId), ADD KEY (buyPrice), ADD KEY (productCode);

Let’s re-run the same query after adding the indexes. The result should look like this:

********************** 1. row **********************
           id: 1
  select_type: SIMPLE
        table: o
         type: const
possible_keys: PRIMARY,customerNumber
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 1
        Extra:
********************** 2. row **********************
           id: 1
  select_type: SIMPLE
        table: c
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 1
        Extra:
********************** 3. row **********************
           id: 1
  select_type: SIMPLE
        table: d
         type: ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 4
        Extra:
********************** 4. row **********************
           id: 1
  select_type: SIMPLE
        table: p
         type: eq_ref
possible_keys: PRIMARY,productLine
          key: PRIMARY
      key_len: 17
          ref: classicmodels.d.productCode
         rows: 1
        Extra:
********************** 5. row **********************
           id: 1
  select_type: SIMPLE
        table: l
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 52
          ref: classicmodels.p.productLine
         rows: 1
        Extra:
5 rows in set (0.00 sec)

After adding indexes, the number of records scanned has been brought down to 1 × 1 × 4 × 1 × 1 = 4. That means for each record with orderNumber 10101 in the orderdetails table, MySQL was able to directly find the matching record in all other tables using the indexes and didn’t have to resort to scanning the entire table.

In the first row’s output you can see the join type used is “const,” which is the fastest join type for a table with more than one record. MySQL was able to use the PRIMARY key as the index. The ref column shows “const,” which is simply the value 10101 used in the query’s WHERE clause.
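
The key_len values in the output above (4, 17, and 52) are byte counts, and they can be reproduced from the column definitions. The sketch below rests on assumptions the article does not state: that orderNumber is an INT, that productCode and productLine are VARCHAR(15) and VARCHAR(50), that all three are NOT NULL, and that the tables use a single-byte character set. The function names are hypothetical.

```python
def varchar_key_len(max_chars, bytes_per_char=1, nullable=False):
    """Bytes for a VARCHAR index key: data, 2 length bytes, optional NULL flag."""
    return max_chars * bytes_per_char + 2 + (1 if nullable else 0)


def int_key_len(nullable=False):
    """Bytes for an INT index key, plus one byte if the column allows NULL."""
    return 4 + (1 if nullable else 0)


if __name__ == "__main__":
    print(int_key_len())        # 4, matches the key on orderNumber
    print(varchar_key_len(15))  # 17, matches the key on productCode
    print(varchar_key_len(50))  # 52, matches the key on productLine
```

If a reported key_len is smaller than the full size of a multi-column index, that usually means only a leading part of the index is being used, as in row 3, where only the 4-byte orderNumber part of the (orderNumber, productCode) primary key appears.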

Let’s take a look at one more example query. Here we’ll basically take the union of two tables, products and productvariants, each joined with productlines. The productvariants table consists of different product variants, with productCode as the reference key, and their prices.

EXPLAIN SELECT * FROM (
    SELECT p.productName, p.productCode, p.buyPrice, l.productLine, p.status, l.status AS lineStatus
    FROM products p
    INNER JOIN productlines l ON p.productLine = l.productLine
    UNION
    SELECT v.variantName AS productName, v.productCode, p.buyPrice, l.productLine, p.status, l.status AS lineStatus
    FROM productvariants v
    INNER JOIN products p ON p.productCode = v.productCode
    INNER JOIN productlines l ON p.productLine = l.productLine
) products
WHERE status = 'Active' AND lineStatus = 'Active' AND buyPrice BETWEEN 30 AND 50\G
********************** 1. row **********************
           id: 1
  select_type: PRIMARY
        table: <derived2>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 219
        Extra: Using where
********************** 2. row **********************
           id: 2
  select_type: DERIVED
        table: p
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 110
        Extra:
********************** 3. row **********************
           id: 2
  select_type: DERIVED
        table: l
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 52
          ref: classicmodels.p.productLine
         rows: 1
        Extra:
********************** 4. row **********************
           id: 3
  select_type: UNION
        table: v
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 109
        Extra:
********************** 5. row **********************
           id: 3
  select_type: UNION
        table: p
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 17
          ref: classicmodels.v.productCode
         rows: 1
        Extra:
********************** 6. row **********************
           id: 3
  select_type: UNION
        table: l
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 52
          ref: classicmodels.p.productLine
         rows: 1
        Extra:
********************** 7. row **********************
           id: NULL
  select_type: UNION RESULT
        table: <union2,3>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
        Extra:
7 rows in set (0.01 sec)

You can see a number of problems in this query. It scans all records in the products and productvariants tables. As there are no indexes on these tables for the productLine and buyPrice columns, the output’s possible_keys and key columns show NULL. The status of products and productlines is checked after the UNION, so moving those checks inside the UNION will reduce the number of records. Let’s add some additional indexes and rewrite the query.

CREATE INDEX idx_buyPrice ON products(buyPrice);
CREATE INDEX idx_buyPrice ON productvariants(buyPrice);
CREATE INDEX idx_productCode ON productvariants(productCode);
CREATE INDEX idx_productLine ON products(productLine);

EXPLAIN SELECT * FROM (
    SELECT p.productName, p.productCode, p.buyPrice, l.productLine, p.status, l.status AS lineStatus
    FROM products p
    INNER JOIN productlines AS l ON (p.productLine = l.productLine AND p.status = 'Active' AND l.status = 'Active')
    WHERE buyPrice BETWEEN 30 AND 50
    UNION
    SELECT v.variantName AS productName, v.productCode, p.buyPrice, l.productLine, p.status, l.status
    FROM productvariants v
    INNER JOIN products p ON (p.productCode = v.productCode AND p.status = 'Active')
    INNER JOIN productlines l ON (p.productLine = l.productLine AND l.status = 'Active')
    WHERE v.buyPrice BETWEEN 30 AND 50
) product\G
********************** 1. row **********************
           id: 1
  select_type: PRIMARY
        table: <derived2>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 12
        Extra:
********************** 2. row **********************
           id: 2
  select_type: DERIVED
        table: p
         type: range
possible_keys: idx_buyPrice,idx_productLine
          key: idx_buyPrice
      key_len: 8
          ref: NULL
         rows: 23
        Extra: Using where
********************** 3. row **********************
           id: 2
  select_type: DERIVED
        table: l
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 52
          ref: classicmodels.p.productLine
         rows: 1
        Extra: Using where
********************** 4. row **********************
           id: 3
  select_type: UNION
        table: v
         type: range
possible_keys: idx_buyPrice,idx_productCode
          key: idx_buyPrice
      key_len: 9
          ref: NULL
         rows: 1
        Extra: Using where
********************** 5. row **********************
           id: 3
  select_type: UNION
        table: p
         type: eq_ref
possible_keys: PRIMARY,idx_productLine
          key: PRIMARY
      key_len: 17
          ref: classicmodels.v.productCode
         rows: 1
        Extra: Using where
********************** 6. row **********************
           id: 3
  select_type: UNION
        table: l
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 52
          ref: classicmodels.p.productLine
         rows: 1
        Extra: Using where
********************** 7. row **********************
           id: NULL
  select_type: UNION RESULT
        table: <union2,3>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
        Extra:
7 rows in set (0.01 sec)

As you can see in the result, the approximate number of rows scanned is significantly reduced from 2,625,810 (219 × 110 × 109) to 276 (12 × 23), which is a huge performance gain. If you ran the original query, without the rearrangement, simply after adding the indexes, you wouldn’t see much of a reduction. MySQL isn’t able to make use of the indexes because the WHERE clause is applied to the derived result. After moving those conditions inside the UNION, it is able to make use of the indexes. This means just adding an index isn’t always enough; MySQL won’t be able to use it unless you write your queries accordingly.

Summary

In this article I discussed the MySQL EXPLAIN keyword, what its output means, and how you can use its output to construct better queries. In the real world, it can be even more useful than in the scenarios demonstrated here. More often than not, you’ll be joining a number of tables together and using complex WHERE clauses. Simply adding indexes on a few columns may not always help, and then it’s time to take a closer look at the queries themselves.

Image via Efman / Shutterstock

Translated from: https://www.sitepoint.com/using-explain-to-write-better-mysql-queries/
