I am a big proponent of something I call “back to basics moment”. What I mean by that is the moment when I learn – or rather re-learn – something by reading either a help page covering a well known feature or by revisiting some of those old blogs by HIC, Qlik Luminaries or posted on AskQV. A welcome by product of these moments is when I find something completely new for me. Just last week I was reading playing with null values and I opened the section of help page titled “Value handling variables”. Apart from good old friends NullDisplay, NullInterpret and NullValue, I noticed a new friend here. It’s name is OtherSymbol and boy is it helpful. I am perplexed this variable skipped the attention of greater Qlik Community (and mine!) for I don’t even know how long. I’ve found a community post referring to this variable from 2014, so it seems it was there like forever.
OK, so what is this OtherSymbol and why am I so excited about it? Read on. OtherSymbol variable helps Qlik data architects to handle missing values in much more simple and easy way. Remember your very first Qlik training? Certainly there was a part dedicated to handling missing dimension values, because – well – these situations happen way too often, you cannot select nulls, etc.
load fact table
load dimension table
find out which IDs are in the fact table, but not in the dimension table
concatenate missing IDs to the dimension table and give them some sort of dummy label
This included creating a temporary field to check which IDs are in fact / dimension tables and using Where Exists() condition, which can be quite confusing, plus it did not perform very well with large tables. You know, doing resident load on a fact table with 10M rows unless absolutely necessary is rarely a great idea. Enter OtherSymbol. You can simply concatenate ONE line to the dimension table with an ID value equal to a symbol defined in OtherSymbol variable, and of course some sort of “unassigned” label. This will in fact concatenate ALL unassociated IDs! No more missing values with just few lines of code:
Set OtherSymbol = +; Concatenate (Dim_CustomerCategory) LOAD * Inline [ CustomerID, Customer Category +, unassigned ];
No more unnecessary resident loads and confusing where conditions. I haven’t tested it on a really large table yet, but I assume this works on symbol table level, so it will be waaaay more performant then any other technique. I draw my assumption on the fact that the record containing the other symbol value must be loaded as the very last record. If I added a new line to Dim_CustomerCategory with an ID which was previously unassociated, but I would assign it [Customer Category]=”New category”, that ID would end up connected to both “unassigned” and “New category”. Based on this I THINK that Engine does the assignment on the symbol table level at the time when it encounters the OtherSymbol value.
Do you use OtherSymbol in your scripts already? Let me know about your experience in the comments. Is this new for you? You can download an app with modified test script attached as an example: Other Symbol (dataqlues)
5 thoughts on “OtherSymbol: working with missing data (just) got a whole lot easier”
Mind. Blown. Totally passed me by also, thanks for sharing.
Thanks also for the AskQV shout-out. Can’t believe I had missed this blog from it. This has now been fixed.
Hi Steve, thanks for stopping by and thank you for including my not-so-frequently updated blog to AskQV! It is a dream come true honour for me.
T̶h̶e̶ N̶o̶r̶t̶h̶ F̶a̶c̶e̶ ???? – ????? ???? ?????????!!
Looks like there is a lot of testing ahead of me. I can see so much potential use cases for this neat solution. Thanks for sharing!
Hi Lech. Yes, there’s a lot of potential. I will certainly gradually update many of my existing apps. It’s not a big deal, but it makes script a bit clearer. Also, in some cases every second of reload time counts, right?
Whoa… Thanks for sharing this. I’ll be putting this to use on my current project, and likely going update some older apps.