A collegue asked my help with a performance issue. He was trying to tune a query that ran on a 6 TB table. The query had to return rows for a couple of specific statusses and for that it should only return 143,000 rows. For this a function based index was created with the following function:
DECODE(pt.STATUS, 'Awaiting', 'Awaiting','OnHold','OnHold',NULL)
But the optimizer refused to use the index. Instead it did a full table scan on another large table (400 GB), and the whole thing took hours. But with an index hint the response time was a matter of seconds.
My collegue already prepared an event 10053 trace for the query without a hint and I did one for the one with a hint. I used BeyondCompare 3 to see where differences were. This was a part of the trace of the unhinted execution plan where Oracle determined the access path of the 400 GB table:
Access Path: TableScan
Cost: 4850857.52 Resp: 4850857.52 Degree: 0
Cost_io: 4821117.00 Cost_cpu: 264001190158
Resp_io: 4821117.00 Resp_cpu: 264001190158
****** trying bitmap/domain indexes ******
****** finished trying bitmap/domain indexes ******
Best:: AccessPath: TableScan
Cost: 4850857.52 Degree: 1 Resp: 4850857.52 Card: 849590.75 Bytes: 0
And here is the same part for the hinted query:
Access Path: index (FullScan)
Index: I_INSTR_PK_INSTRKEY
resc_io: 88744279.00 resc_cpu: 1419200639011
ix_sel: 1.000000 ix_sel_with_filters: 1.000000
Cost: 88904156.20 Resp: 24695598.94 Degree: 4
Best:: AccessPath: IndexRange
Index: I_INSTR_PK_INSTRKEY
Cost: 88904156.20 Degree: 4 Resp: 24695598.94 Card: 849590.75 Bytes: 0
Well, one thing that I noticed, was that calculated cost of the function based index was for some reason about twice as high as the full table scan.
But then my collegue asked why the index was so small (100 MB). A full index scan only resulted in 143,000 rows. Concatenated with a second column it was much larger, 50 GB.
So I started thinking. NULLs are not stored in an index (unless you add a constant as a concatenated column for example). What about this function? Let’s look at the function once more:
DECODE(pt.STATUS, 'Awaiting', 'Awaiting','OnHold','OnHold',NULL)
So Awaiting and OnHold are rare occurring statusses. And this decode says: IF not rare occurring status 1 and not rare occurring status 2, THEN NULL. I’ve read the documentation and nope, these NULL values are not stored. Concatenated with another column, you can store these NULLs however.
Depending on what you want to accomplish, you need to think twice about using functions that can result in NULLs in function based indexes.