Paul VinkenoogFirebird Null Guide
Table of Contents
- What is NULL?
- NULL in expressions
- NULL handling in UDFs
- NULL in if statements
- Testing if something is NULL
- Setting a field or variable to NULL
- Dealing with NULLs
- License notice
Time and again, support questions pop up on the Firebird mailing lists related to "strange things" happening with NULLs in Firebird SQL. The concept seems difficult to grasp - perhaps partly because of the name, which suggests a "nothing" that won't do any harm if you add it to a number or stick it to the back of a string. In reality, performing such operations will render the entire expression NULL.
This article explores the behaviour of NULL in Firebird SQL, points out common pitfalls and shows you how to safely deal with expressions that contain NULL or may resolve to NULL.
If all you want is a quick reference to refresh your memory, hop straight to the summary (which is really brief indeed).
A few sentences and examples in this guide were taken from the Firebird Quick Start Guide, first published by IBPhoenix, now part of the Firebird Project.
In SQL, NULL is not a value. It is a state indicating that an item's value is unknown or nonexistent. It is not zero or blank or an "empty string" and it does not behave like any of these values. Few things in SQL lead to more confusion than NULL, and yet its workings shouldn't be hard to understand as long as you stick to the following simple definition: NULL means unknown.
Let me repeat that:
NULL means UNKNOWN
Keep this line in mind as you read through the rest of the article, and most of the seemingly illogical results you can get with NULL will practically explain themselves.
As many of us have found out to our chagrin, NULL is contagious: use it in a numerical, string or date/time expression, and the result will always be NULL. Use it in a boolean expression, and the outcome depends on the type of operation and the other values involved.
Please note, by the way, that in Firebird versions prior to 2.0 it is mostly illegal to use the constant NULL directly in operations or comparisons. Wherever you see NULL in the expressions below, read it as "a field, variable or other expression that resolves to NULL".
Expressions returning NULL
The expressions in this list always return NULL:
- 1 + 2 + 3 + NULL
- 'Home ' || 'sweet ' || NULL
- MyField = NULL
- MyField <> NULL
- NULL = NULL
- not (NULL)
If you have difficulty understanding why, remember that NULL means "unknown". Also have a look at the following table where per-case explanations are provided. In the table we don't write NULL in the expressions (as said, this is often illegal); instead, we use two entities A and B that are both NULL. A and B may be fields, variables, or entire subexpressions in their own right - as long as they're NULL, they'll all behave the same in the enclosing expressions.
Table 1. Operations on null entities A and B
|If A and B are NULL, then:||Is:||Because:|
|1 + 2 + 3 + A||NULL||If A is unknown, then 6 + A is also unknown.|
|'Home ' || 'sweet ' || A||NULL||If A is unknown, 'Home sweet ' || A is unknown.|
|MyField = A||NULL||If A is unknown, you can't tell if MyField has the same value...|
|MyField <> A||NULL||...but you also can't tell if MyField has a different value!|
|A = B||NULL||With A and B unknown, it's impossible to know if they are equal.|
|not (A)||NULL||If A is unknown, its inverse is also unknown.|
NULL in boolean expressions
We've already seen that not(NULL) yields NULL. For the and and or operators, things are a bit more complicated:
- NULL or false = NULL
- NULL or true = true
- NULL or NULL = NULL
- NULL and false = false
- NULL and true = NULL
- NULL and NULL = NULL
Firebird SQL doesn't have a boolean data type; nor are true and false existing constants. In the left column of the explanatory table below, (true) and (false) represent subexpressions returning true/false.
Table 2. Boolean operations on null entity A
|If A is NULL, then:||Is:||Because:|
|A or (false)||NULL||"A or false" always has the same value as A - which is unknown.|
|A or (true)||true||"A or true" is always true - A's value doesn't matter.|
|A or A||NULL||"A or A" always equals A - which is NULL.|
|A and (false)||false||"A and false" is always false - A's value doesn't matter.|
|A and (true)||NULL||"A and true" always has the same value as A - which is unknown.|
|A and A||NULL||"A and A" always equals A - which is NULL.|
All these results are in accordance with boolean logic. The fact that, in order to compute "X or true" and "X and false", you simply don't need to know X's value, is also the basis of a feature we know from various programming languages: short-circuit boolean evaluation.
More logic (or not)
The short-circuit results obtained above may lead you to the following ideas:
- 0 times x equals 0 for every x. Hence, even if x's value is unknown, 0 * x is 0. (Note: this only holds if x's data type only contains numbers, not NaN or infinities.)
- The empty string is ordered lexicographically before every other string. Therefore, S >= '' is true whatever the value of S.
- Every value equals itself, whether it's unknown or not. So, although A = B justifiedly returns NULL if A and B are different NULL entities, A = A should always return true, even if A is NULL.
How is this implemented in Firebird SQL? Well, I'm sorry I have to inform you that despite this compelling logic - and the analogy with the boolean results discussed above - the following expressions all resolve to NULL:
- 0 * NULL
- NULL >= ''
- '' <= NULL
- A = A (with A a null field or variable)
So much for consistency.
NULL in aggregate functions
In aggregate functions like COUNT, SUM, AVG, MAX, and MIN, NULL is handled differently: to calculate the outcome, only non-NULL fields are taken into consideration. That is, if you have this table:
...the statement select sum(Amount) from MyTable returns 54, which is 37 + 5 + 12. Had all five fields been summed, the result would have been NULL. For AVG, the non-NULL fields are summed and the sum divided by the number of non-NULL fields.
There is one exception to this rule: COUNT(*) returns the count of all rows, even rows whose fields are all NULL. But COUNT(FieldName) behaves like the other aggregate functions in that it only counts rows where the specified field is not NULL.
Another thing worth knowing is that COUNT(*) and COUNT(FieldName) never return NULL: if there are no rows in the set, both functions return 0. Also, COUNT(FieldName) returns 0 if all FieldName fields in the set are NULL. The other aggregate functions return NULL in such cases. Be warned that even SUM returns NULL if used on an empty set, which is contrary to common logic.
UDFs (User Defined Functions) are functions that are not internal to the engine, but defined in separate modules. Firebird ships with two UDF libraries: ib_udf (inherited from InterBase) and fbudf. You can add additional libraries, e.g. by buying or downloading them, or by writing them yourself. UDFs can't be used out of the box; they have to be "declared" to the database first. This is also true for the UDFs that come with Firebird itself.
NULL <-> non-NULL conversions you didn't ask for
Teaching you how to declare, use, and write UDFs is outside the scope of this guide. However, we must warn you that UDFs can occasionally perform unexpected NULL conversions. This will sometimes result in NULL input being converted to a regular value, and other times in the nullification of valid input like '' (an empty string).
The main cause of this problem is that with "old style" UDF calling, it is not possible to pass NULL as input to the function. When a UDF like LTRIM (left trim) is called with a NULL argument, the argument is passed to the function as an empty string. From inside the function there is no way of telling if this argument represents a real empty string or a NULL. So what does the function implementor do? He has to make a choice: either take the argument at face value, or assume it was originally a NULL and treat it accordingly.
Depending on the result type, returning NULL may be possible even if receiving NULL isn't. Thus, the following unexpected things can happen:
- You call a UDF with a NULL argument. It is passed as a value, e.g. 0 or ''. Within the function, this argument is not changed back to NULL; a non-NULL result is returned.
- You call a UDF with a valid argument like 0 or ''. It is passed as-is (obviously). But the function code supposes that this value really represents a NULL, treats it as a black hole, and returns NULL to the caller.
Both conversions are usually unwanted, but the second probably more so than the first (better validate something NULL than wreck something valid). To get back to our LTRIM example: up to and including Firebird 1.0.3, this function returned NULL if you fed it an emptystring; as from version 1.5, it never returns NULL. In these recent versions, NULL strings are "trimmed" to emptystrings - which is wrong, but is considered the lesser of two evils: in the old situation, valid (empty) strings were mercilessly reduced to NULLs.
Being prepared for undesired conversions
The unsolicited conversions described above normally only happen with legacy UDFs, but there are a lot of them around (most notably in ib_udf). Also, nothing will stop a careless implementor from doing the same in a new-style function. So the bottom line is: if you use a UDF and you don't know how it behaves with respect to NULL:
- Look at its declaration to see how values are passed and returned. If it says "by descriptor", it should be safe (though it never hurts to make sure). In all other cases, walk through the rest of the steps.
- If you have the source and you can read C/C++, inspect the function code.
- Test the function both with NULL input and with input like 0 (for numerical arguments) and/or '' (for string arguments).
- If the function performs an undesired NULL <-> non-NULL conversion, you'll have to work around it in your code before calling the UDF (see also Testing if something is NULL, elsewhere in this guide).
The declarations for the shipped UDF libraries can be found in the Firebird subdirectory bin/examples (v. 1.0) or bin/UDF (v. 1.5 and up). Look at the files with extension .sql
More on UDFs
To learn more about UDFs, consult the InterBase 6.0 Developer's Guide (free at http://www.ibphoenix.com/downloads/60DevGuide.zip), Using Firebird and the Firebird Reference Guide (both on CD), or the Firebird Book. CD and book can be purchased here.
If the test expression of an if statement resolves to NULL, the then clause is skipped and the else clause (if present) executed. But beware! The expression may behave like false in this case, but it doesn't have the value false. It's still NULL, and weird things may happen if you forget that. The following examples explore some of the fiendish workings of NULL in if statements:
if (a = b) then
MyVariable = 'Equal';
MyVariable = 'Not equal';
If a and b are both NULL, MyVariable will yet be "Not equal" after executing this code. The reason is that the expression "a = b" yields NULL if at least one of them is NULL. With the test expression NULL, the then block is skipped, and the else block executed.
if (a <> b) then
MyVariable = 'Not equal';
MyVariable = 'Equal';
Here, MyVariable will be "Equal" if a is NULL and b isn't, or vice versa. The explanation is analogous to that of the previous example.
if (not (a <> b)) then
MyVariable = 'Equal';
MyVariable = 'Not equal';
This one looks like it should give the same results as the previous example, doesn't it? After all, we've inverted the test expression and swapped the then and else clauses. And indeed, as long as neither variable is NULL, both code fragments are equivalent. But as soon as a or b becomes NULL, so does the entire test expression, the else clause is executed, and the result is "Not equal".
Of course we're aware that this third example is fully equivalent to the first. We've merely included it to stress once more that not(NULL) is NULL. So, in situations where the test expression resolves to NULL, not() doesn't invert it.
In light of the havoc that NULL can wreak, you will often want to know whether something is NULL before you use it in an expression. To some, the obvious test would seem to be
if (A = NULL) then...
and indeed there are database management systems that support this syntax to determine nullness. But the SQL standard doesn't allow it, and neither does Firebird. In versions prior to 2.0 this entire syntax is even illegal. From 2.0 onward it is permitted, but the comparison always returns NULL, regardless of A's state and value. It is therefore worthless as a nullness test - what we need is a clear true or false result.
The correct way to test for NULL is:
...is null / ...is not null
These tests always return true or false - no messing around. Examples:
if (MyField is null) then...
select * from Pupils where PhoneNumber is not null
select * from Pupils where not (PhoneNumber is null)
/* does the same as the previous example */
update Numbers set Total = A + B + C where A + B + C is not null
You could say that whereas "=" (when used as an equality operator) can only compare values, "is" tests for a state.
Fields and variables can be set to NULL using the same syntax as for regular values:
insert into MyTable values (1, 'teststring', NULL, '8-May-2004')
update MyTable set MyField = null where YourField = -1
if (Number = 0) then MyVariable = null;
- "Wait a minute... and you said that MyField = NULL was illegal!"
That's right... for the comparison operator "=" (at least in pre-2.0 versions of Firebird). But here we are talking about "=" as an assignment operator. Unfortunately, both operators have the same symbol in SQL. In assignments, whether done with "=" or with an insert list, you can treat NULL just like any value - no special syntax needed (or indeed possible).
This section contains some practical tips and examples that may be of use to you in your everyday dealings with NULLs.
Testing for NULL - if it matters
Quite often, you don't need to take special measures for fields or variables that may be NULL. For instance, if you do this:
select * from Customers where Town = 'Ralston'
you probably don't want to see the customers whose town is unspecified. Likewise:
if (Age >= 18) then CanVote = 'Yes'
doesn't include people of unknown age, which is also defendable. But:
if (Age >= 18) then CanVote = 'Yes';
else CanVote = 'No';
seems less right: if you don't know a person's age, you shouldn't explicitly deny her the right to vote. Worse, this:
if (Age < 18) then CanVote = 'No';
else CanVote = 'Yes';
won't have the same effect as the previous. If some of the NULL ages are in reality under 18, you're now letting minors vote!
The right approach here is to test for NULL explicitly:
if (Age is null) then CanVote = 'Unsure';
if (Age >= 18) then CanVote = 'Yes';
else CanVote = 'No';
else always refers back to the last if in the same block. But it's often good to avoid confusion by putting begin...end keywords around a group of lines. I didn't do that here though - I wanted to keep the number of lines down. And then I made up for it by adding this note ;-)
Finding out if fields are the same
Sometimes you want to find out if two fields or variables are the same and you want to consider them equal if they are both NULL. The correct test for this is:
if (A = B or A is null and B is null) then...
or, if you want to make the precedence of the operations explicit:
if ((A = B) or (A is null and B is null)) then...
A word of warning though: if exactly one of A and B is NULL, the test expression becomes NULL, not false! This is OK in an if statement, and we can even add an else clause which will be executed if A and B are not equal (including when one is NULL and the other isn't):
if (A = B or A is null and B is null)
then ...stuff to be done if A equals B...
else ...stuff to be done if A and B are different...
But you shouldn't get the bright idea of inverting the expression and using it as an inequality test (like I once did):
/* Don't do this! */
if (not(A = B or A is null and B is null))
then ...stuff to be done if A differs from B...
The above code will work correctly if A and B are both NULL or both non-NULL. But it will fail to execute the then clause if exactly one of them is NULL.
If you only want something to be done if A and B are different, either use one of the correct expressions shown above and put a dummy statement in the then clause, or use this longer test expression:
/* This is a correct inequality test: */
if (A <> B
or A is null and B is not null
or A is not null and B is null) then...
Finding out if a field has changed
In triggers, it's often useful to know if a certain field has changed (including: gone from NULL to non-NULL or vice versa) or stayed the same. This is nothing but a special case of testing the (in)equality of two fields. Just use New.Fieldname and Old.Fieldname for A and B:
if (New.Job = Old.Job or New.Job is null and Old.Job is null)
then ...Job field has stayed the same...
else ...Job field has changed...
Substituting NULL with a value
The COALESCE function
Firebird 1.5 has a function that can convert NULL to most anything else. This enables you to do an on-the-fly conversion and use the result in your further processing, without the need for "if (MyExpression is null) then" constructs. The function is called COALESCE and you call it like this:
COALESCE(Expr1, Expr2, Expr3, ...)
COALESCE returns the first non-NULL expression in the argument list. If all the expressions are NULL, it returns NULL.
This is how you would use COALESCE to construct a person's full name from the first, middle and last names, assuming that some middle name fields may be NULL:
|| coalesce(' ' || MiddleName, '')
|| ' ' || Lastname
Or, to create an as-informal-as-possible name from a table that also includes nicknames, and assuming that both nickname and first name may be NULL:
select coalesce (Nickname, FirstName, 'Mr./Mrs.')
|| ' ' || Lastname
COALESCE will only help you out in situations where NULL can be treated in the same way as some allowed value for the datatype. If NULL needs special handling, like in the "right to vote" example used previously, your only option is to go for "if (MyExpression is null) then" after all.
Firebird 1.0: the *NVL functions
Firebird 1.0 doesn't have COALESCE. However, you can use four UDFs that provide a good part of its functionality. These UDFs reside in the fbudf lib and they are:
- iNVL, for integer arguments
- i64NVL, for bigint arguments
- dNVL, for double precision arguments
- sNVL, for strings
The *NVL functions take exactly two arguments. Like COALESCE, they return the first argument if it's not NULL; otherwise, they return the second. Please note that the Firebird 1.0 fbudf lib - and therefore, the *NVL functions - is only available for Windows.
NULL in a nutshell:
- NULL means unknown.
- If NULL figures in an expression, most of the time the entire expression becomes NULL.
- In aggregate functions only non-NULL fields are involved in the computation. Exception: COUNT(*).
- UDFs sometimes convert NULL <-> non-NULL in a seemingly random manner.
- If the test expression of an if statement is NULL, the then block is skipped and the else block executed.
- To find out if A is NULL, use "A is (not) null".
- The COALESCE and *NVL functions can convert NULL to a value.
- Assigning NULL is done like assigning values: with "A = NULL" or an insert list.
Remember, this is how NULL works in Firebird SQL. There may be (subtle) differences with other RDBMSes.
The contents of this Documentation are subject to the Public Documentation License Version 1.0 (the "License"); you may only use this Documentation if you comply with the terms of this License. Copies of the License are available at http://www.firebirdsql.org/pdfmanual/pdl.pdf (PDF) and http://www.firebirdsql.org/manual/pdl.html (HTML).
The Original Documentation is titled Firebird Null Guide.
The Initial Writer of the Original Documentation is: Paul Vinkenoog.
Copyright (C) 2005. All Rights Reserved. Initial Writer contact: paulvink at users dot sourceforge dot net.