Oracle的直方图试验
直方图有两种类别,频率直方图与高度均衡直方图。
默认的,如果一个倾斜列上的唯一值超过了254个,那么Oracle会对此列建立高度均衡直方图,否则建立频率直方图。
通过如下方式,建立表TAB,更新字段B,让列B产生倾斜。并在B列上创建索引。
SQL> create table tab (a number, b number);
表已创建。
SQL>
SQL> begin
2 for i in 1..10000 loop
3 insert into tab values (i, i);
4 end loop;
5 commit;
6 end;
7 /
PL/SQL 过程已成功完成。
SQL> update tab set b=5 where b between 6 and 9995;
已更新9990行。
SQL> commit;
提交完成。
SQL> create index idx_tab_b on tab(b);
索引已创建。
然后分析表,强制使列B不产生直方图。
BEGIN
DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'SCOTT',
TABNAME => 'TAB',
CASCADE => TRUE,
METHOD_OPT => 'FOR COLUMNS B SIZE 1 ');
END;
SQL>SELECT table_name,column_name,endpoint_number,endpoint_value FROM USER_TAB_HISTOGRAMS WHERE TABLE_NAME='TAB' ;
TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE
------------------------------ ---------------------------------------- --------------- --------------
TAB B 0 1
TAB B 1 10000
SQL> select * from tab where b=1;
执行计划
----------------------------------------------------------
Plan hash value: 613302217
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 7000 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TAB | 1000 | 7000 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_TAB_B | 1000 | | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("B"=1)
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
590 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> select * from tab where b=5;
已选择9991行。
执行计划
----------------------------------------------------------
Plan hash value: 613302217
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 7000 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TAB | 1000 | 7000 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_TAB_B | 1000 | | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("B"=5)
统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
1371 consistent gets
0 physical reads
0 redo size
244153 bytes sent via SQL*Net to client
7846 bytes received via SQL*Net from client
668 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
9991 rows processed
BEGIN
DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'SCOTT',
TABNAME => 'TAB',
CASCADE => TRUE,
METHOD_OPT => 'FOR ALL COLUMNS SIZE AUTO ');
END;
在B=1时候采用索引扫描,而B=5时候,已经采用全表扫描了,说明直方图起了作用。
SQL> select * from tab where b=1;
执行计划
----------------------------------------------------------
Plan hash value: 613302217
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TAB | 1 | 7 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_TAB_B | 1 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("B"=1)
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
590 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> select * from tab where b=5;
已选择9991行。
执行计划
----------------------------------------------------------
Plan hash value: 1995730731
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9991 | 69937 | 7 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TAB | 9991 | 69937 | 7 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("B"=5)
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
1371 consistent gets
0 physical reads
0 redo size
244153 bytes sent via SQL*Net to client
7846 bytes received via SQL*Net from client
668 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
9991 rows processed
查看此时的直方图信息:
SQL>SELECT TABLE_NAME, COLUMN_NAME, ENDPOINT_NUMBER,ENDPOINT_VALUE FROM USER_TAB_HISTOGRAMS
WHERE TABLE_NAME = 'TAB' order by ENDPOINT_NUMBER;
TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE
------------------------------ ---------------------------------------- --------------- --------------
TAB B 1 1
TAB B 2 2
TAB B 3 3
TAB B 4 4
TAB B 9995 5
TAB B 9996 9996
TAB B 9997 9997
TAB B 9998 9998
TAB B 9999 9999
TAB B 10000 10000
BEGIN
DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'SCOTT',
TABNAME => 'TAB',
CASCADE => TRUE,
METHOD_OPT => 'FOR COLUMNS B SIZE 8 ');
END;
由于列B有10个唯一值,通过上面的SIZE 8可以强制Oracle使用高度均衡直方图。
查看直方图信息:
SQL>SELECT TABLE_NAME, COLUMN_NAME, ENDPOINT_NUMBER, ENDPOINT_VALUE FROM USER_TAB_HISTOGRAMS
WHERE TABLE_NAME = 'TAB';
TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE
------------------------------ ------------------------------ --------------- --------------
TAB B 0 1
TAB B 7 5
TAB B 8 10000
TAB A 0 1
TAB A 1 10000
从查询结果惊奇的发现只有三个桶0 7 8,原来Oracle会自动省去EDNPOINT_VALUE值相同且ENDPOINT_NUMBER相邻的桶的值。
省去了桶(EDNPOINT_NUMBER)为1 2 3 4 5 6 ,EDNPOINT_VALUE为5的六条内容。
说明:在高度均衡直方图中,EDNPOINT_NUMBER代表桶号,这一点与频率直方图不同。
再看高度均衡直方图下的执行计划:
SQL> select * from tab where b=5;
已选择9991行。
已选择9991行。
执行计划
----------------------------------------------------------
Plan hash value: 1995730731
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8750 | 61250 | 7 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TAB | 8750 | 61250 | 7 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("B"=5)
统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
688 consistent gets
0 physical reads
0 redo size
212183 bytes sent via SQL*Net to client
7845 bytes received via SQL*Net from client
668 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
9991 rows processed
这时候发现执行计划的ROWS部分,Oracle计算出来的cardinality不是特别精确的。9991才是精确值。而频率直方图可以精确到9991,高度均衡直方图只精确到了8750。因此可以说频率直方图比高度均衡直方图稳定、精确。
可是现实很多时候,列的唯一值是超过254的,因此只能使用高度均衡直方图。