Sunday, June 16, 2013

Maven's Sliding Window


Expanding on the example from last time about loading unstructured text data (to read about it, search on QlikView Maven Loading an Unstructured Text File); here’s one way to examine the data. I call it Maven’s Sliding Window. My requirement was to design a method that would allow a straight table to review the rows of text even though there may be a million rows or more. It had to work even though there may be a special selection applied (for example, a selection of all rows containing the word “program”). The straight table had to show just a very small subset of the rows at any one time to avoid the memory issue that would occur if the straight table tried to show many thousands of rows.

[If there was no selection already applied then this task is easy and no special object is required—just select a contiguous group of 20 rows and you can page though the data 20 rows at a time with the Page Up or Page Down keys on your keyboard.]

First, create a variable named vFinger with a current value of 1 and a constraint so that it contains positive integers. Now, create a Slider Object to update the value of vFinger. In the General tab set Mode to Single Value, set Min Value to =min(recno), set Max Value to =max(recno), click the static step box and put 1 in the static step box. I made the Caption tab title say “Window Pointer”. Make the slider have a vertical orientation to help make the usage a little more intuitive.

Now, set up a straight table object. Set up a calculated dimension named recno with an expression like this:
 =If(aggr(rank(-if(recno>$(vFinger),recno)),recno)<=20,recno) 

And set up one expression defined simply as  logtext  (remember that we are still working with the data loaded in the example from 6May2013)
On the Sort tab specify that the dimension recno is sorted ascending numeric.
In my simple example, I made the object title a text expression like this:
  ='Sliding Window Text from $(File_Pathname):'
so that the title uses the variable containing the file pathname to show where the data came from.

That gives us a straight table that is showing only the records surrounding the record number specified by the vFinger variable. Using the slider that controls vFinger controls which section of text appears in the straight table. You could make the “window” bigger by changing the numbers in the calculated dimension.

Add a Page Down button that updates the value of vFinger to move the window “down”. The metaphor of a sliding window means that sliding down goes toward larger recno values and sliding the window “up” goes toward the lower recno values. The button action would be Set Variable to update the vFinger variable. The action Value would be  =max(If(aggr(rank(-if(recno>$(vFinger),recno)),recno)<=20,recno)) 
That adds to the vFinger variable while staying within the recno values that are selected.
A similar Page Up button would contain a Set Variable action to update vFinger with a value like this  =min(If(aggr(rank(if(recno<$(vFinger),recno)),recno)<=20,recno)) 

If I’m working with long text records then I often set up slider objects to control which text columns appear in the straight table. For example, you could set up a slider with horizontal orientation to control the value of a variable named startcol and a slider to control a variable named endcol. Then, in the expression, instead of simply specifying logtext, code this text expression:  Mid(logtext,startcol,(endcol-startcol)+1)
That will allow you to control which text columns appear in the straight table and you can slide that part of the “window” left and right.

Note that you can save the document and later when you re-open it, the sliding window will still show the same group of records since it is driven by variable values that are saved along with the document.

My example here had a record number field to use as a dimension for the straight table. But, any data that forms a series would work just as well; for example, dates, time of day or timestamps, degree/minutes/seconds of longitude, hours of operation of a machine, measurements from the Hadron Collider, etc. 
The next blog post will discuss using the power of QlikView selection and set analysis to help us examine the text data.


★★★