Wednesday, 11 April 2012

Beyond NVL

If there's one thing I've ever learned while developing in Oracle, there is always more than one way to solve a problem.

The question is - have you done it in the simplest, most intuitive manner?

That's the bar to set, because it may be you the revisits some code 2, 6, 12 months later...

Recently, I came across some code like this:
CASE
    WHEN NVL(custom_value, normal_value) = normal_value THEN
      NULL
    ELSE
      normal_value
    END AS alt_value
I had a fair idea what it was trying to resolve, but I wasn't sure and I still had to run some scenarios to be sure.
Basically, this was the 2nd of two fields that showed a normal value and a custom value. The first field was simply
NVL(custom_value, normal_value)

The second field should be:
If there is a custom value, show the normal value, otherwise show nothing.

This case statement does this, but more like:
If there isn't a custom value, compare the normal value with itself - if it is the same, show nothing, otherwise show the normal value.

Que?

I don't know if it's lack of familiarity with the NULL-related functions, or some people don't immediately consider them.
http://download.oracle.com/docs/cd/E11882_01/server.112/e17118/functions002.htm#CJAFHIFF

While mentally trying to convert it to something simpler, I simplified to
NULLIF(normal_value, NVL(custom_value, normal_value))

Which translates to:
If normal value is the same as itself if the custom value is not present, then use the normal value, otherwise nothing.

Then I recognised it could be simplified further, just like resolving a fraction ;-)
NVL2(custom_value, normal_value, NULL)

If custom value is not null, use normal value, otherwise show nothing.

Which reads just like my requirement - using one function, referencing each field once only.

Super.

Scott.

Added 11 April 2012 - in response to comments
with data as 
 (select 'Mouse' widget, 10 normal_value, null custom_value from dual union all
  select 'Laptop', 1000, 800 from dual union all
  select 'Mixer', 30, 30 from dual union all
  select 'Lost sock', null, 1 from dual)
select 
   widget
  ,normal_value
  ,custom_value
  ,nvl(custom_value, normal_value) first_field
  ,case when nvl(custom_value, normal_value) = normal_value then
    null
   else
    normal_value
  end as alt_value_orig
 ,nvl2(custom_value, normal_value, null) alt_value_scott
 ,case when custom_value <> normal_value then normal_value end alt_value_anon
 ,coalesce(custom_value, normal_value) alt_value_hayland
from data; 


Cheers

10 comments:

llayland said...

I believe your functions are not equivalent. Here is a counter example:

CASE
WHEN NVL(1, 1) = 1 THEN
NULL
ELSE
1
END AS alt_value

--> null

nvl2(1,1,null)

--> 1

I like to verify simplifications like this with a small plsql block consisting of a nested loops for each of the variables and a check for equality of the expressions. I usually include a dbms_output of the variables and the results

Peter said...

...and to complete the confusion there is another function, which does quite the same but better: try coalesce (http://docs.oracle.com/cd/E11882_01/server.112/e17118/functions030.htm#i1001341)

Anonymous said...

CASE WHEN custom_value<>normal_value THEN normal_value
END

Jan

llayland said...

has it.

@Peter - coalesce is like a lazy cascading nvl; it does no comparison between its arguments. I often use coalesce instead of nvl because of its laziness.

Scott Wesley said...

@llayland Correct, if the values were equal, then these functions would not return the same result (added example in post). However, there was an unmentioned assertion that those values would never be equal.

Additionally, but final comment about "referencing each field once only" - this is a reference to performance issues about the true expression behind these generic examples.

@Peter Coalesce certain has advantages, it certainly has performance benefits.
However, in this case it is not appropriate (included SQL in post above)

@anonymous (Jan) Elegant, but I think it's more an off the backboard shot, whose mechanics isn't immediately clear.
On a tangent, I have seen mutterings of potential issues with using <> over != Found this after a quick search, but I've seen further investigations in the past.

@llayland "has it" ?

What do you mean by a "lazy" function? No comparison between its arguments?

**

In my examples the 'mixer' won't occur thanks to a check constraint, nor will the 'lost sock' appear... get it?

;-)

llayland said...

"anonymous has it"

Lazy means it only evaluates its arguments if it needs to.

coalesce(null, expensive_function()) returns quickly because it does not call expensive_function

Scott Wesley said...

Never heard the word 'lazy' used to describe that ;-)

"Short circuit evaluation" perhaps - or is that too textbook?

llayland said...

depends on the textbook :)

http://en.wikipedia.org/wiki/Lazy_evaluation

If I really wanted to be technical I would say "call by name" since lazy implies sharing to avoid duplicate calculations which I do not think coalesce does.

Jeffrey Kemp said...

I've always understood Lazy Evaluation = Short-circuit Evaluation.

Also, I'm very dubious about there being any actual performance difference between <> and !=. Reading the linked posts I'd suspect a cached query plan, that gets reparsed when the operator is changed; or, perhaps a stored outline.

Scott Wesley said...

textbooks indeed!

I agree, Jeff - although I'm sure I saw a more valid reason in the past!