Oracle Histograms
Histogram
Holds data the data about values within a column in a table and number
of occurrences for specific value or range. Depend on predicate values
used in a query it stores the range of the index. Optimizer uses the
histogram to generate execution plan, it may generate a plan with Index
full table scan or Index range scan. Histograms are useful when we have
distinct data in the table. The
information about histogram is stored in data dictionary view
"dba_tab_col_statistics". METHOD_OPT parameter is used to control the
creation of histograms
creating
histogram on table depends on what columns we are gathering on table.
we can use different operations while creating histogram on tables
FOR COLUMNS SIZE AUTO: Fastest. you can specify one column instead of all columns.
FOR ALL COLUMNS SIZE REPEAT: Prevent deletion of histograms and collect it only for columns already have histograms.
FOR ALL COLUMNS:collect histograms on all columns .
FOR ALL COLUMNS SIZE SKEWONLY:collect histograms for columns have skewed value .
FOR ALL INDEXES COLUMNS:collect histograms for columns have indexes.
when we use Auto OPTION while gathering or creating histogram on the table, Oracle will track the need of histograms depend on the column usage from sys.col_usage$. using this info internally it creates buckets (histogram) on the columns and stores access info ( range is stored inside the bucket )
Exec dbms_stats.gather_table_stats
(ownname => 'UNIVERSE', tabname=>'INDIA'
estimate_percent => dbms_stats.auto_sample_size,
method_opt => 'FOR COLUMNS SIZE AUTO',
degree => 4);
For example Creating histogram for frequently used column or frequency histogram :)
If the column is repeatedly access by the application. we can create histogram on the column.Here we are creating histogram for column "state" in schema universe & table india
Exec DBMS_STATS.GATHER_TABLE_STATS
(ownname => 'UNIVERSE',
tabname => 'INDIA',
method_opt => 'FOR COLUMNS STATE',
degree => 4);
Creating Hybrid - histogram in 12c:
before creating hybrid column, check for distinct values on table.
method_opt => 'FOR COLUMNS STATE size 10', 10 buckets are created and access info ( range is stored inside the bucket
Exec DBMS_STATS.GATHER_TABLE_STATS
(ownname => 'UNIVERSE',
tabname => 'INDIA',
method_opt => 'FOR COLUMNS STATE size 10',
degree => 4);
The optimizer considers both hybrid and top frequency histograms. To qualify for a top frequency histogram, the percentage of rows occupied by the top 10 most frequent values must be equal to or greater than threshold p, where p is (1-(1/10))*100, or 90%.
Query to investigate histogram created for column "STATE"
select table_name,column_name,num_distinct, histogram from user_tab_col_statistics where table_name='INDIA' and column_name='STATE';
Limitations
Never create histogram on the primary key and have a less distinct values