Saturday, 14 January 2023

Histograms

 Gather stats for column level is called histogram.

A histogram is a special type of column statistic that provides more detailed information about the data distribution in a table column. A histogram sorts values into "buckets," as you might sort coins into buckets.

The database view DBA_TAB_COL_STATISTICS or USER_TAB_COL_STATISTICS has a column called HISTOGRAM.

Types of Oracle Histograms

- HEIGHT BALANCED - Height Based : Each bucket has same number of values,skewed values occupy more buckets

- FREQUENCY - Value Based: Each key has its own

-  TOP FREQUENCY HISTOGRAMS

-  HYBRID HISTOGRAMS

- NONE - No histogram present for the column

Note: Frequency histograms are more precise than Height Balanced histograms.

Note: If you regather table stats again then histogram will re-create automatically

Please note REPEAT option will re-create existing histograms without creating new histograms.

a California-based book store ships 95% of the books to California, 4% to Oregon, and 1% to Nevada. The book orders table has 300,000 rows. A table column stores the state to which orders are shipped. A user queries the number of books shipped to Oregon. Without a histogram, the optimizer assumes an even distribution of 300000/3 (the NDV is 3), estimating cardinality at 100,000 rows. With this estimate, the optimizer chooses a full table scan. With a histogram, the optimizer calculates that 4% of the books are shipped to Oregon, and chooses an index scan.

Source : Linkedin Article

No comments:

Post a Comment