千家信息网

微课sql优化(7)、统计信息收集(5)-关于直方图

发表于:2025-01-22 作者:千家信息网编辑
千家信息网最后更新 2025年01月22日,1、直方图介绍你可以为列统计直方图。这些直方图为列数据的分布情况提供准确的估算。当列数据倾斜时,直方图提供更为优化的选择性估算,为数据分布不均匀的情况提供最优的执行计划。 Oracle Databas
千家信息网最后更新 2025年01月22日微课sql优化(7)、统计信息收集(5)-关于直方图

1、直方图介绍


你可以为列统计直方图。这些直方图为列数据的分布情况提供准确的估算。当列数据倾斜时,直方图提供更为优化的选择性估算,为数据分布不均匀的情况提供最优的执行计划。 Oracle Database为提供2种类别的列统计信息直方图:
  • Height-Balanced Histograms
  • Frequency Histograms
数据库存储直方图信息*TAB_COL_STATISTICS视图(用户和DBA)。列值范围:HEIGHTBALANCED, FREQUENCY, or NONE。 2、Height-Balanced Histograms 在height-balanced histogram中,列值被划分为桶,使得每个桶包含大致相同数量的行。直方图显示端点在值范围内的位置。 考虑一个my_col值为1到100之间的列,以及一个10个桶的直方图。如果数据my_col均匀分布,则直方图看起来与图13-1类似,其中数字是端点值。例如,第七个桶具有值在60到70之间的行。 图13-1具有均匀分布的高度平衡直方图
每个桶中的行数为总行数的10%。在这个均匀分布的例子中,40%的行的值在60到100之间。 如果数据不均匀分布,则直方图可能如图13-2所示。在这种情况下,大多数行的列的值为5。只有10%的行的值在60到100之间。 图13-2具有非均匀分布的高度平衡直方图 您可以使用USER_TAB_HISTOGRAMS表格查看高度平衡的直方图,如示例13-1所示。
  1. BEGIN
  2. DBMS_STATS . GATHER_table_STATS (
  3. OWNNAME = > 'OE' ,
  4. TABNAME = > 'INVENTORIES' ,
  5. METHOD_OPT = > 'FOR COLUMNS SIZE 10 quantity_on_hand' ) ;
  6. END ;
  7. /

  8. SELECT COLUMN_NAME , NUM_DISTINCT , NUM_BUCKETS , HISTOGRAM
  9. FROM USER_TAB_COL_STATISTICS
  10. WHERE TABLE_NAME = 'INVENTORIES' AND COLUMN_NAME = 'QUANTITY_ON_HAND' ;

  11. COLUMN_NAME NUM_DISTINCT NUM_BUCKETS HISTOGRAM
  12. ------------------------------ ------------ ----------- ---------------
  13. QUANTITY_ON_HAND 237 10 HEIGHT BALANCED

  14. SELECT ENDPOINT_NUMBER , ENDPOINT_VALUE
  15. FROM USER_TAB_HISTOGRAMS
  16. WHERE TABLE_NAME = 'INVENTORIES' AND COLUMN_NAME = 'QUANTITY_ON_HAND'
  17. ORDER BY ENDPOINT_NUMBER ;

  18. ENDPOINT_NUMBER ENDPOINT_VALUE
  19. --------------- --------------
  20. 0 0
  21. 1 27
  22. 2 42
  23. 3 57
  24. 4 74
  25. 5 98
  26. 6 123
  27. 7 149
  28. 8 175
  29. 9 202
  30. 10 353
在示例13-1查询输出中,一行(1-10)对应于直方图中的每个桶。Oracle数据库向该直方图添加了特殊的第0个数据桶,因为第1个数据桶(27)中的值不是quantity_on_hand列的最小值。第0个桶的最小值为0 quantity_on_hand。

3、 frequency histogra m


frequency histogram中,列的每个值对应于直方图的单个桶。每个桶包含此单个值的出现次数。例如,假设36行包含列的值1 warehouse_id。端点值1具有端点号36。 数据库在以下条件下自动创建频率直方图,而不是高度平衡的直方图:
  • 不同值的数量小于或等于指定的直方图桶数(最多254个)。
  • 每个列值重复一次。
您可以使用USER_TAB_HISTOGRAMS视图查看频率直方图,如示例13-2所示。
  1. BEGIN
  2. DBMS_STATS . GATHER_TABLE_STATS (
  3. OWNNAME = > 'OE' ,
  4. TABNAME = > 'INVENTORIES' ,
  5. METHOD_OPT = > 'FOR COLUMNS SIZE 20 warehouse_id' ) ;
  6. END ;
  7. /

  8. SELECT COLUMN_NAME , NUM_DISTINCT , NUM_BUCKETS , HISTOGRAM
  9. FROM USER_TAB_COL_STATISTICS
  10. WHERE TABLE_NAME = 'INVENTORIES' AND COLUMN_NAME = 'WAREHOUSE_ID' ;

  11. COLUMN_NAME NUM_DISTINCT NUM_BUCKETS HISTOGRAM
  12. ------------------------------ ------------ ----------- ---------------
  13. WAREHOUSE_ID 9 9 FREQUENCY

  14. SELECT ENDPOINT_NUMBER , ENDPOINT_VALUE
  15. FROM USER_TAB_HISTOGRAMS
  16. WHERE TABLE_NAME = 'INVENTORIES' AND COLUMN_NAME = 'WAREHOUSE_ID'
  17. ORDER BY ENDPOINT_NUMBER ;

  18. ENDPOINT_NUMBER ENDPOINT_VALUE
  19. --------------- --------------
  20. 36 1
  21. 213 2
  22. 261 3
  23. 370 4
  24. 484 5
  25. 692 6
  26. 798 7
  27. 984 8
  28. 1112 9
在例13-2中,第一个桶为warehouse_id1。该值在表中显示36次,如以下查询所证实: oe@PROD> SELECT COUNT(*) FROM inventories WHERE warehouse_id = 1; COUNT(*) ---------- 36

5、 练习4 、直方图优化练习


统计已销户用户数量,请优化以下语句 select count(1) from ht.c_cons where status='close';
  1. SQL > select status , count ( 1 ) from ht . c_cons group by status ;

  2. STATUS COUNT ( 1 )
  3. ------------------------------------------------------------ ----------
  4. close 19
  5. open 9519
  6. creating 462

  7. SQL > create index ht . idx_c_cons_status on ht . c_cons ( status ) ;
  8. SQL > col owner for a10
  9. col table_name for a20
  10. col column_name for a20
  11. col data_type for a30
  12. col histogram for a20
  13. select owner , table_name , column_name , data_type ,
  14. column_id , num_distinct , histogram , NUM_NULLS , LAST_ANALYZED from
  15. dba_tab_columns where table_name = 'C_CONS' and owner = 'HT'
  16. order by column_id ; SQL > SQL > SQL > SQL > SQL > 2 3 4

  17. OWNER TABLE_NAME COLUMN_NAME DATA_TYPE COLUMN_ID NUM_DISTINCT HISTOGRAM NUM_NULLS LAST_ANALYZED
  18. ---------- -------------------- -------------------- ------------------------------ ---------- ------------ -------------------- ---------- ------------------------------
  19. HT C_CONS CONS_NO NUMBER 1 10000 NONE 0 20 - AUG - 17
  20. HT C_CONS CONS_NAME VARCHAR2 2 5057 NONE 0 20 - AUG - 17
  21. HT C_CONS ORG_NAME VARCHAR2 3 12 NONE 0 20 - AUG - 17
  22. HT C_CONS BUILD_DATE DATE 4 10000 NONE 0 20 - AUG - 17
  23. HT C_CONS STATUS VARCHAR2 5 3 NONE 0 20 - AUG - 17

  24. SQL > exec DBMS_STATS . GATHER_TABLE_STATS ( ownname = > 'HT' , tabname = > 'C_CONS' , estimate_percent = > 30 , method_opt = > 'for columns size 50 status' , no_invalidate = > FALSE , degree = > 4 , cascade = > TRUE ) ;
  25. PL / SQL procedure successfully completed .
  26. SQL > col owner for a10
  27. SQL > col table_name for a20
  28. col column_name for a20
  29. col data_type for a30
  30. col histogram for a20
  31. select owner , table_name , column_name , data_type ,
  32. column_id , num_distinct , histogram , NUM_NULLS , LAST_ANALYZED from
  33. dba_tab_columns where table_name = 'C_CONS' and owner = 'HT'
  34. order by column_id ; SQL > SQL > SQL > SQL > 2 3 4

  35. OWNER TABLE_NAME COLUMN_NAME DATA_TYPE COLUMN_ID NUM_DISTINCT HISTOGRAM NUM_NULLS LAST_ANALYZED
  36. ---------- -------------------- -------------------- ------------------------------ ---------- ------------ -------------------- ---------- ------------------------------
  37. HT C_CONS CONS_NO NUMBER 1 10000 NONE 0 20 - AUG - 17
  38. HT C_CONS CONS_NAME VARCHAR2 2 5057 NONE 0 20 - AUG - 17
  39. HT C_CONS ORG_NAME VARCHAR2 3 12 NONE 0 20 - AUG - 17
  40. HT C_CONS BUILD_DATE DATE 4 10000 NONE 0 20 - AUG - 17
  41. HT C_CONS STATUS VARCHAR2 5 3 FREQUENCY 0 20 - AUG - 17
  42. SQL > select count ( 1 ) from ht . c_cons where status = 'open' ;
  43. Execution Plan
  44. ----------------------------------------------------------
  45. Plan hash value : 2016425671

  46. -------------------------------------------------------------------------------------------
  47. | Id | Operation | Name | Rows | Bytes | Cost ( % CPU ) | Time |
  48. -------------------------------------------------------------------------------------------
  49. | 0 | SELECT STATEMENT | | 1 | 6 | 8 ( 0 ) | 00 : 00 : 01 |
  50. | 1 | SORT AGGREGATE | | 1 | 6 | | |
  51. | * 2 | INDEX FAST FULL SCAN | IDX_C_CONS_STATUS | 9639 | 57834 | 8 ( 0 ) | 00 : 00 : 01 |
  52. -------------------------------------------------------------------------------------------

  53. Predicate Information ( identified by operation id ) :
  54. ---------------------------------------------------
  55. 2 - filter ( "STATUS" = 'open' )
  56. Statistics
  57. ----------------------------------------------------------
  58. 1 recursive calls
  59. 0 db block gets
  60. 28 consistent gets
  61. 0 physical reads
  62. 0 redo size
  63. 527 bytes sent via SQL * Net to client
  64. 523 bytes received via SQL * Net from client
  65. 2 SQL * Net roundtrips to / from client
  66. 0 sorts ( memory )
  67. 0 sorts ( disk )
  68. 1 rows processed
  69. SQL >
  70. SQL > select count ( 1 ) from ht . c_cons where status = 'close' ;
  71. Execution Plan
  72. ----------------------------------------------------------

  73. Plan hash value : 2292286995

  74. ---------------------------------------------------------------------------------------
  75. | Id | Operation | Name | Rows | Bytes | Cost ( % CPU ) | Time |
  76. ---------------------------------------------------------------------------------------
  77. | 0 | SELECT STATEMENT | | 1 | 6 | 1 ( 0 ) | 00 : 00 : 01 |
  78. | 1 | SORT AGGREGATE | | 1 | 6 | | |
  79. | * 2 | INDEX RANGE SCAN | IDX_C_CONS_STATUS | 24 | 144 | 1 ( 0 ) | 00 : 00 : 01 |
  80. ---------------------------------------------------------------------------------------

  81. Predicate Information ( identified by operation id ) :
  82. ---------------------------------------------------

  83. 2 - access ( "STATUS" = 'close' )


  84. Statistics
  85. ----------------------------------------------------------
  86. 1 recursive calls
  87. 0 db block gets
  88. 2 consistent gets
  89. 0 physical reads
  90. 0 redo size
  91. 526 bytes sent via SQL * Net to client
  92. 523 bytes received via SQL * Net from client
  93. 2 SQL * Net roundtrips to / from client
  94. 0 sorts ( memory )
  95. 0 sorts ( disk )
  96. 1 rows processed
0