Friday, December 16, 2011

Count of Rows in a Chart






Once in a while someone asks me if there's a way to count the rows in a chart and put the number of rows in the chart title. I usually respond by asking, "why would anyone want to do that?" Usually, an expression that counts the distinct values of the fields used as the chart dimension is close enough. If you want a method that actually counts the rows in a straight table chart, here's something I have used that counts the rows and puts the count into a variable named chart_row_count. It will need to be modified for your environment and tested. (***make sure you also see the suggestion near the end)


Create a macro in the module code that looks like this:


Sub Count_Rows

'-------------

'/* Put your chart object id in the next line */

Set CHRT=ActiveDocument.GetSheetObject("CH70")

On error resume next

'/* in the next line, put an upper limit on the */

'/* loop bigger than your expected chart row count */

For i=1 to 500

ccontent=CHRT.GetCell(i,0).text

If Err.Number <> 0 then

Err.Clear

On error goto 0

ActiveDocument.Variables("chart_row_count").SetContent i-1, True

Exit For

End If

Next

On error goto 0

Set CHRT=nothing

End sub


The count in the variable includes the total line if there is one. You can use the variable value in a text expression in your chart title. One thing you must do that will vary from document to document depending on your requirements-- you must figure out how to trigger the macro when the number of rows in the chart might have changed. Triggering when a selection changes is easy to do. Triggering when a cycle group is changed will be harder. For testing, use a button to trigger the macro.


= = = = = = = = = = = = = = = = = = = = = = = = = = = = =


*** nrbhattad suggests this much simpler macro for finding row count:
set chart=ActiveDocument.GetSheetObject("CH01")

ActiveDocument.Variables("chart_row_count").SetContent chart.GetNoOfRows, true



.

Friday, November 18, 2011

Select Top 4 Values On Open






I was working on a document that contained a lot of data and the performance when the document was opened or when the user moved from tab to tab was a little slow. On my own laptop when I save documents like that I usually make a selection of a small amount of data so that later when I open the document again it will open quickly. I wanted to do something like that for the users -- some kind of selection that would happen automatically when the document is opened, but it had to be a selection of data that would be useful and easy for the users to understand.



The data in the document was organized by a date field named POSTDATE so I decided that I would like the document to automatically select the most recent four POSTDATE values when it was opened. Most users look in the document for the most recent data and anyone who wanted to review older data could easily add older dates to the selection through the multibox.




The method I used is to create a trigger that would fire when the document is opened. The trigger would do a Pareto Select. Now, I'm not saying that this is the best way to accomplish the selection and it certainly is not the only way, but it was easy and didn't involve writing macro code.




Pareto Select is used to select a percentage of values in a field based on ranking of a value or expression. It is commonly used select things like the top 20% best selling products. In this case I wanted to use it to select not a fixed percentage but the four most recent POSTDATE date values. This is what I did:



From the menu choose Settings->Document Properties and choose the Triggers tab.
In the document event triggers click the Add Action button for the OnOpen event.
Click Add Action and in the Add Action dialog choose Action Type=Selection and Action=Pareto Select then click OK.
In the Actions dialog type POSTDATE into both the Field and Selection boxes. POSTDATE is not only the field I want to select but also the expression value I want ranked.
Then, in the Percentage box enter =(4/COUNT(DISTINCT POSTDATE))*100
And click OK. Save the document and the next time it is opened it will automatically select the four most recent POSTDATE values.




A similar method could be used to do automatic selection of any number of top values of any field. Pareto Select does its selection based on the current Possible values. If you wanted to ignore the current selection and select a percentage of all values then you should add an extra action to the trigger to clear the field before making the Pareto Selection.



Testing trigger actions is most easily done by defining the same actions for a button and then you can test the action by clicking the button. When you're done testing, remove the button and make sure the actions are associated with the proper event.

= = = = = = = = = = = = = = = = = = = = = = = = = =
Author's Note 6Dec2011:
This method didn't work the way I wanted when the document already included selections on the POSTDATE field. I couldn't find any good way to deal with that using the available trigger actions. I eventually replaced the Pareto Select action with a macro action that selects the four most recent POSTDATE values. The macro code looks like this:
sub Top_4_POSTDATE
set fd=ActiveDocument.GetField("POSTDATE")
fd.Clear
fd.TopSelect "=if(aggr({1} rank(POSTDATE),POSTDATE)<=4,POSTDATE)",4
set fd=nothing
end sub

* * *

Saturday, November 12, 2011

Replace IF in Expression with Set Analysis





We often come across chart expressions that look like this:



=sum(if(SLSTYPE='DOMESTIC',SLSTOT))




It is easy to understand and if your document is a normal size and working well then leave the expression alone. But, if the document is large and you are looking for ways to speed up chart performance then consider changing the expression by replacing the "if" in the expression with the much faster set analysis syntax. Remember that set analysis is related to the concept of QlikView selections. Set analysis syntax works by specifying temporary selections within the function. So, if we wanted to consider set analysis for our example expression we should imagine whether the expression would work ok if we removed the "if" from the expression and, instead, made a selection on the SLSTYPE field to select the value DOMESTIC. If the expression would work ok in that situation then we can change the expression to use set analysis like this:



=sum({$<SLSTYPE ={'DOMESTIC'}>} SLSTOT)




We couldn't do that if the SLSTYPE field wasn't associated with SLSTOT because selecting the value DOMESTIC would have no effect on the expression. For example, if SLSTYPE had been defined in the loadscript as a Data Island unrelated to any other data then the expression wouldn't work correctly not matter how you coded the set analysis syntax within the sum function.




Another important consideration is whether or not the report is designed for users to make their own selections on the SLSTYPE field. If users are making SLSTYPE selections then they may not like an expression that overrides their selection within the expression. If the document allows user selection on the SLSTYPE field then we should define a separate field in the loadscript that will contain the exact same values as SLSTYPE and be part of the same QlikView table and same rows. Maybe call it SLSTYPE_SA and then we can code the set analysis part of our expression using the SLSTYPE_SA field and the expression will work correctly while still respecting all other user selections.




* * *

Friday, November 4, 2011

Increasing Likelyhood of Duplicate Values While Loading Data





I was working on a QlikView document last week that loaded several kinds of sales forecast data and also actual sales for a chemical manufacturer. The forecast data came from a statistical forecasting application. The charts on the document were simple and showed the forecast data rounded to integer numbers of tons.


The document contained a lot of data and the performance was a little slow when making selections so I looked for ways to improve performance. Examining the forecast data, I saw that although the quantities in the database were expressed in tons almost every quantity included a decimal fraction. Now, QlikView is very good at compressing data and especially good at compressing data with duplicate values since it stores each unique value for a field only once. This forecast data did not contain a lot of duplicates because of the decimal fractions. I saw that if I rounded the values as they were loaded then values like 1.001, 1.00032, 1.2, etc. would all be loaded as 1. After changing the loadscript to round values to integers as the data was loaded the report file size on disk was reduced by 40% and had a corresponding reduction in memory required. The charts on the report were not affected all.


This technique would not be useful in all situations. It would cause a lot of trouble if you rounded data values on some accounting documents! But, sometimes, altering data slightly as it is loaded to increase the likelyhood of duplicate values can save memory for a QlikView document. Besides rounding (not necessarily to integers, you can round to a specific number of decimal places), you can also increase the likelyhood of duplicate values by:



  • Truncating text data - keeping only the first ten characters, for example.



  • Trimming text data to remove any leading or trailing blanks.



  • Converting to all capital letters.


  • Converting non-abbreviated words to standard abbreviations, like state abbreviations in address data.



  • Removing the time stamp portion from date data



  • Converting date data to beginning-of-week or beginning of month dates.



  • Converting date data to month abbreviations if that is the level of data used in the document charts.



* * *

Sunday, October 23, 2011

Using Oracle Configured for Different Countries



We load much of the data in our QlikView documents from Oracle databases. A common query technique uses the Oracle Next_Day function to aggregate data into weekly quantities. For example,



Select Next_Day(INC_DATE,2)-7, Count(distinct INCIDENT_NUM) From INC_LOG group by NextDay(INC_DATE,2)-7;



will gather a count of security incidents by week. The Next_Day function in the example should return a Monday date. But, sometimes, when the Oracle database is configured for a European business then the way Oracle numbers days of the week will be different from a database in the U.S. A European database may give Tuesday dates for that example query.



In order to make the SQL commands in a loadscript behave like they do in the U.S. I add a single SQL line to the loadscript following the ODBC Connect string that modifies the Oracle NLS setting temporarily while data is loaded into the document. The line I add after the ODBC connect string is:



SQL ALTER SESSION SET NLS_TERRITORY='America';



Does anyone have a different way of dealing with this issue?



* * *

Friday, August 19, 2011

Looking Inside a Bookmark Without Opening It




This blog post comes from a question asked on the QlikCommunity forum. The question was how do you look at the selections inside a bookmark without actually selecting and activating the bookmark?

I had never thought about how you might do that. It is not difficult to list out the possible values for a single field based on a bookmark whether the bookmark has been opened or not. For example, for a bookmark named Feb_Sales and a field named ITEM you can put a text expression like this into a text object or chart title:

=concat({Feb_Sales} distinct ITEM, ', ')
That will give you a list of the ITEM values separated by commas. Note that it is a list of Possible values, it will only be the Selected values if the bookmark includes a selection on that particular field.

This technique can be used to give the user a peek into the selected or possible field values in the bookmarks; perhaps helping a user decide which bookmark should be used.


You can also apply the selection to the ITEM field without opening the Feb_Sales bookmark. Go to a listbox or multibox entry for ITEM and type this:

=count({Feb_Sales} 1 )


and that will select the ITEM values (if none of the ITEM values exists in your current document then it will appear that all values have been selected).


If there is a field normally associated with ITEM then you can select that. For example, if the field Product_Category is normally associated with ITEM then go to a listbox or multibox entry for Product_Category and type that same bit of text and it will automatically select the Product_Category values associated with the Feb_Sales bookmark without opening the bookmark!

* * *

Friday, July 29, 2011

A Chart Expression That Works Differently for Totals Row


I built a QlikView document last week with a straight table chart where I needed to have some of the expressions work differently on the totals row than on the regular rows. One of the dimensions on the chart was date -- but the totals row needed to aggregate data for the most recent date only. I wrote an expression that takes advantage of the fact that the RowNo function returns null for the totals row. So, leaving out the actual calculation part, the expression looked something like this:
if(IsNull(RowNo()),
/* do the totals row calculation */ ,
/* else, do the regular calculation */)


A similar idea can be used if you want to change the color or bold the totals row. For example, you could use an expression like this for background color:
if(IsNull(RowNo()),yellow())

and make the totals row look like it is emphasized with a highlighter color.

Important note: If you use the RowNo() function in your column expression you will not be able to re-sort the chart by double-clicking on a column title. If that is a problem for your application then use if(Dimensionality()=0,... as a substitute.


* * *