How do I correlate the query plan with the query report in Amazon Redshift?

I want to correlate the query plan with the query report in my Amazon Redshift cluster.

Short description
To determine the usage that's required to run a query in Amazon Redshift, run the EXPLAIN command. The execution plan that the EXPLAIN command returns outlines the query planning and execution steps that are involved. Then, use the SVL_QUERY_REPORT system view to view query information at a cluster slice level. You can use the slice-level information to detect uneven data distribution across the cluster that can affect query performance.

Amazon Redshift processes the query plan and translates the plan into steps, segments, and streams. For more information, see Query planning and execution workflow.
Resolution

Create a table and get the execution plan and SVL query report

To create a table and get the execution plan and SVL query report for the query, complete the following steps:
- Run the following query, which doesn't join on a distribution key:

  ```sql
  select eventname, sum(pricepaid)
  from sales, event
  where sales.eventid = event.eventid
  group by eventname
  order by 2 desc;
  ```

  This query distributes the inner table to all compute nodes.
- Retrieve the query plan:

  ```sql
  EXPLAIN <query>;
  ```

  The following is example output:

  ```
  QUERY PLAN
  --------------------------------------------------------------------------------------------------------
  XN Merge  (cost=1002815368414.24..1002815368415.67 rows=571 width=27)
    Merge Key: sum(sales.pricepaid)
    ->  XN Network  (cost=1002815368414.24..1002815368415.67 rows=571 width=27)
          Send to leader
          ->  XN Sort  (cost=1002815368414.24..1002815368415.67 rows=571 width=27)
                Sort Key: sum(sales.pricepaid)
                ->  XN HashAggregate  (cost=2815368386.67..2815368388.10 rows=571 width=27)
                      ->  XN Hash Join DS_BCAST_INNER  (cost=109.98..2815367496.05 rows=178125 width=27)
                            Hash Cond: ("outer".eventid = "inner".eventid)
                            ->  XN Seq Scan on sales  (cost=0.00..1724.56 rows=172456 width=14)
                            ->  XN Hash  (cost=87.98..87.98 rows=8798 width=21)
                                  ->  XN Seq Scan on event  (cost=0.00..87.98 rows=8798 width=21)
  (12 rows)
  ```
- Run the SVL_QUERY_REPORT query to get the query report:

  ```sql
  select *
  from svl_query_report
  where query = query_id
  order by segment, step, elapsed_time, rows;
  ```

  **Note:** Replace query_id with the ID of your query.
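If you don't know the query ID, you can look it up first. The following lookup is a sketch that assumes the standard STL_QUERY system table; the LIKE pattern is a placeholder for a fragment of your own query text:

```sql
-- Find recent query IDs whose text matches a fragment of the query you ran
select query, starttime, trim(querytxt) as querytxt
from stl_query
where querytxt like '%eventname%'
order by starttime desc
limit 10;
```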
Map the query plan with the query report

To map the query plan with the query report, complete the following steps:
- Run the following query to get the SVL_QUERY_REPORT for the query with a segment value of 0:

  ```sql
  select query, slice, segment, step, start_time, end_time, elapsed_time, rows, bytes, label
  from svl_query_report
  where query = 938787 and segment = 0
  order by segment, step, elapsed_time, rows;
  ```

  This segment corresponds to the following part of the execution plan:

  ```
  EXPLAIN <query>;
  ->  XN Hash  (cost=87.98..87.98 rows=8798 width=21)
        ->  XN Seq Scan on event  (cost=0.00..87.98 rows=8798 width=21)
  ```

  The following is example output:

  ```
  query  | slice | segment | step |         start_time         |         end_time          | elapsed_time | rows | bytes  |            label
  -------+-------+---------+------+----------------------------+---------------------------+--------------+------+--------+------------------------------
  938787 |     0 |       0 |    0 | 2020-05-22 11:11:48.828309 | 2020-05-22 11:11:48.82987 |         1561 | 4383 | 128626 | scan   tbl=278788 name=event
  938787 |     1 |       0 |    0 | 2020-05-22 11:11:48.828309 | 2020-05-22 11:11:48.82987 |         1561 | 4415 | 128918 | scan   tbl=278788 name=event
  938787 |     0 |       0 |    1 | 2020-05-22 11:11:48.828309 | 2020-05-22 11:11:48.82987 |         1561 | 4383 |      0 | project
  938787 |     1 |       0 |    1 | 2020-05-22 11:11:48.828309 | 2020-05-22 11:11:48.82987 |         1561 | 4415 |      0 | project
  938787 |     0 |       0 |    2 | 2020-05-22 11:11:48.828309 | 2020-05-22 11:11:48.82987 |         1561 | 4383 | 126660 | bcast
  ...
  (6 rows)
  ```

  In the preceding output, when the segment value is 0, Amazon Redshift runs a sequential scan operation to scan the event table. You can find the sequential scan operation in the label column.

- Run the following query to get the SVL_QUERY_REPORT for the query with a segment value of 1:

  ```sql
  select query, slice, segment, step, start_time, end_time, elapsed_time, rows, bytes, label
  from svl_query_report
  where query = 938787 and segment = 1
  order by segment, step, elapsed_time, rows;
  ```

  The query continues to run until the segment value is 1, and a hash table operation is performed on the inner table in the join. The following is example output:

  ```
  query  | slice | segment | step |         start_time         |          end_time          | elapsed_time | rows | bytes  |                   label
  -------+-------+---------+------+----------------------------+----------------------------+--------------+------+--------+-------------------------------------------
  938787 |     1 |       1 |    0 | 2020-05-22 11:11:48.826864 | 2020-05-22 11:11:48.830037 |         3173 |    0 |      0 | scan   tbl=376297 name=Internal Worktable
  938787 |     0 |       1 |    0 | 2020-05-22 11:11:48.826864 | 2020-05-22 11:11:48.831142 |         4278 | 8798 | 253580 | scan   tbl=376297 name=Internal Worktable
  938787 |     1 |       1 |    1 | 2020-05-22 11:11:48.826864 | 2020-05-22 11:11:48.830037 |         3173 |    0 |      0 | project
  938787 |     0 |       1 |    1 | 2020-05-22 11:11:48.826864 | 2020-05-22 11:11:48.831142 |         4278 | 8798 |      0 | project
  938787 |     1 |       1 |    2 | 2020-05-22 11:11:48.826864 | 2020-05-22 11:11:48.830037 |         3173 |    0 |      0 | hash   tbl=439
  ...
  (6 rows)
  ```

- Run the following query to get the SVL_QUERY_REPORT for the query with a segment value of 2:

  ```sql
  select query, slice, segment, step, start_time, end_time, elapsed_time, rows, bytes, label
  from svl_query_report
  where query = 938787 and segment = 2
  order by segment, step, elapsed_time, rows;
  ```

  This segment corresponds to the following part of the execution plan:

  ```
  EXPLAIN <query>;
  ->  XN Hash Join DS_BCAST_INNER  (cost=109.98..2815367496.05 rows=178125 width=27)
        Hash Cond: ("outer".eventid = "inner".eventid)
        ->  XN Seq Scan on sales  (cost=0.00..1724.56 rows=172456 width=14)
  ```

  The following is example output:

  ```
  query  | slice | segment | step |         start_time         |          end_time          | elapsed_time | rows  |  bytes  |            label
  -------+-------+---------+------+----------------------------+----------------------------+--------------+-------+---------+------------------------------
  938787 |     1 |       2 |    0 | 2020-05-22 11:11:48.839297 | 2020-05-22 11:11:48.865857 |        26560 | 86519 | 1730380 | scan   tbl=278792 name=sales
  938787 |     0 |       2 |    0 | 2020-05-22 11:11:48.838371 | 2020-05-22 11:11:48.865857 |        27486 | 85937 | 1718740 | scan   tbl=278792 name=sales
  938787 |     1 |       2 |    1 | 2020-05-22 11:11:48.839297 | 2020-05-22 11:11:48.865857 |        26560 | 86519 |       0 | project
  938787 |     0 |       2 |    1 | 2020-05-22 11:11:48.838371 | 2020-05-22 11:11:48.865857 |        27486 | 85937 |       0 | project
  938787 |     1 |       2 |    2 | 2020-05-22 11:11:48.839297 | 2020-05-22 11:11:48.865857 |        26560 | 86519 |       0 | project
  938787 |     0 |       2 |    2 | 2020-05-22 11:11:48.838371 | 2020-05-22 11:11:48.865857 |        27486 | 85937 |       0 | project
  938787 |     1 |       2 |    3 | 2020-05-22 11:11:48.839297 | 2020-05-22 11:11:48.865857 |        26560 | 86519 |       0 | hjoin  tbl=439
  938787 |     0 |       2 |    3 | 2020-05-22 11:11:48.838371 | 2020-05-22 11:11:48.865857 |        27486 | 85937 |       0 | hjoin  tbl=439
  938787 |     1 |       2 |    4 | 2020-05-22 11:11:48.839297 | 2020-05-22 11:11:48.865857 |        26560 | 86519 |       0 | project
  938787 |     0 |       2 |    4 | 2020-05-22 11:11:48.838371 | 2020-05-22 11:11:48.865857 |        27486 | 85937 |       0 | project
  938787 |     1 |       2 |    5 | 2020-05-22 11:11:48.839297 | 2020-05-22 11:11:48.865857 |        26560 | 86519 |       0 | project
  938787 |     0 |       2 |    5 | 2020-05-22 11:11:48.838371 | 2020-05-22 11:11:48.865857 |        27486 | 85937 |       0 | project
  938787 |     1 |       2 |    6 | 2020-05-22 11:11:48.839297 | 2020-05-22 11:11:48.865857 |        26560 |   576 |   34916 | aggr   tbl=448
  938787 |     0 |       2 |    6 | 2020-05-22 11:11:48.838371 | 2020-05-22 11:11:48.865857 |        27486 |   576 |   34916 | aggr   tbl=448
  ...
  (16 rows)
  ```

  In the preceding example, the query runs when the segment value is 2, and the query runs a sequential scan operation to scan the sales table. In the same segment, an aggregate operation is run to aggregate results and a hash join operation is run to join the tables. The join columns for one of the tables aren't a distribution key or a sort key. As a result, the inner table is distributed to all compute nodes as DS_BCAST_INNER, which you can see in the execution plan. You can also run this query to get the SVL_QUERY_REPORT for the query with segment values of 3, 4, and 5.
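Uneven per-slice row counts in the scan steps above are the uneven data distribution signal mentioned in the short description. As a minimal sketch in plain Python (the slice row counts are copied from the segment 0 scan step of the example output), you can quantify the skew as the ratio of the busiest slice to the least busy one:

```python
# Rows scanned per slice, taken from the segment 0 "scan" step above
rows_per_slice = {0: 4383, 1: 4415}

def row_skew(rows_by_slice):
    """Ratio of the busiest slice to the least busy one.

    A value close to 1.0 means the rows are evenly distributed;
    a much larger value suggests distribution-key skew.
    """
    counts = list(rows_by_slice.values())
    return max(counts) / min(counts)

print(f"row skew: {row_skew(rows_per_slice):.3f}")  # → row skew: 1.007
```

A ratio near 1.0, as here, indicates that the event table's rows are spread evenly across slices, so the scan work is balanced.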
In these segments, hash aggregate and sort operations are run, and you can identify them by the "aggr" and "sort" labels. The hash aggregate operation is run on unsorted grouped aggregate functions. The sort operation is run to evaluate the ORDER BY clause.

After all the segments are used, the query runs a network operation for segments 4 and 5 to send the intermediate results to the leader node. The results are sent to the leader node for additional processing. You can see the results with the "return" label.
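Rather than reading every row of the report, you can summarize the work per segment yourself. The following is a sketch against SVL_QUERY_REPORT, reusing the example query ID 938787; elapsed_time in this view is in microseconds:

```sql
-- Per-segment summary: time window, longest step, and total rows processed
select segment,
       min(start_time)   as seg_start,
       max(end_time)     as seg_end,
       max(elapsed_time) as max_elapsed_us,
       sum(rows)         as total_rows
from svl_query_report
where query = 938787
group by segment
order by segment;
```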
After the query completes, run the following query to check the execution time of the query in milliseconds:

```sql
select datediff(ms, exec_start_time, exec_end_time)
from stl_wlm_query
where query = 938787;
```

The following is example output:

```
date_diff
-----------
101
(1 row)
```
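As a cross-check, you can compute the same kind of millisecond difference from any pair of timestamps in the report. The following minimal Python sketch approximates DATEDIFF(ms, ...) as elapsed time in whole milliseconds (note that Redshift's DATEDIFF actually counts millisecond boundary crossings, which can differ by one); the timestamps are the segment 0 scan start and segment 2 end from the example output:

```python
from datetime import datetime

def datediff_ms(start, end):
    """Elapsed time between two timestamps in whole milliseconds."""
    return int((end - start).total_seconds() * 1000)

start = datetime.fromisoformat("2020-05-22 11:11:48.828309")
end = datetime.fromisoformat("2020-05-22 11:11:48.865857")
print(datediff_ms(start, end))  # → 37
```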
Optimize your query

As you analyze the query plan, you can adjust query performance for your use case. For more information, see Top 10 performance tuning techniques for Amazon Redshift.