Finding Clues in Data with Qlik
Finding Clues in Data with Qlik

OtherSymbol: working with missing data (just) got a whole lot easier

I am a big proponent of something I call “back to basics moment”. What I mean by that is the moment when I learn – or rather re-learn – something by reading either a help page covering a well known feature or by revisiting some of those old blogs by HIC, Qlik Luminaries or posted on AskQV. A welcome by product of these moments is when I find something completely new for me. Just last week I was reading playing with null values and I opened the section of help page titled “Value handling variables”. Apart from good old friends NullDisplay, NullInterpret and NullValue, I noticed a new friend here. It’s name is OtherSymbol and boy is it helpful. I am perplexed this variable skipped the attention of greater Qlik Community (and mine!) for I don’t even know how long. I’ve found a community post referring to this variable from 2014, so it seems it was there like forever.

OK, so what is this OtherSymbol and why am I so excited about it? Read on. OtherSymbol variable helps Qlik data architects to handle missing values in much more simple and easy way. Remember your very first Qlik training? Certainly there was a part dedicated to handling missing dimension values, because – well – these situations happen way too often, you cannot select nulls, etc.

A chart with missing values
Help! There are nulls in my chart!

 

This section of the training most likely went something like this:
  1. load fact table
  2. load dimension table
  3. find out which IDs are in the fact table, but not in the dimension table
  4. concatenate missing IDs to the dimension table and give them some sort of dummy label

This included creating a temporary field to check which IDs are in fact / dimension tables and using Where Exists() condition, which can be quite confusing, plus it did not perform very well with large tables. You know, doing resident load on a fact table with 10M rows unless absolutely necessary is rarely a great idea. Enter OtherSymbol. You can simply concatenate ONE line to the dimension table with an ID value equal to a symbol defined in OtherSymbol variable, and of course some sort of “unassigned” label. This will in fact concatenate ALL unassociated IDs! No more missing values with just few lines of code:


Set OtherSymbol = +;
Concatenate (Dim_CustomerCategory)
LOAD * Inline [
CustomerID, Customer Category
+, unassigned
];

Done.

Chart after fixing missing values with OtherSymbol
A nice chart without any nulls.

No more unnecessary resident loads and confusing where conditions. I haven’t tested it on a really large table yet, but I assume this works on symbol table level, so it will be waaaay more performant then any other technique. I draw my assumption on the fact that the record containing the other symbol value must be loaded as the very last record. If I added a new line to Dim_CustomerCategory with an ID which was previously unassociated, but I would assign it [Customer Category]=”New category”, that ID would end up connected to both “unassigned” and “New category”. Based on this I THINK that Engine does the assignment on the symbol table level at the time when it encounters the OtherSymbol value.

Do you use OtherSymbol in your scripts already? Let me know about your experience in the comments. Is this new for you? You can download an app with modified test script attached as an example: Other Symbol (dataqlues)

Leave a comment

Your email address will not be published. Required fields are marked *

5 thoughts on “OtherSymbol: working with missing data (just) got a whole lot easier”