Wednesday, December 28, 2016

Straight Table with Different Expressions on Each Row

We had a requirement a few months back for a document that would show a table with different expressions on each row along with some subtotals and expressions that reference other rows and columns. It needed to be a single table that could be sent to Excel. It was obviously a requirement that grew out of existing, legacy reporting solutions using Excel but many corporations are wedded to Excel and the larger the corporation, the harder it is to find anyone who believes they have the authority to make changes or do something different.

The solution was to build a document with a straight table with a single dimension that is loaded in the loadscript with an inline load like this:

TBL_KPI_TYPE:
LOAD * INLINE [
KPI_NO, KPI_TYPE
1, Transportation:
2, Intermodal % of Miles
3, Sea %
4, LTL %
5, Avg Pallets/Truck
6,
7, Total Diesel Cost
8, Diesel $/Gallon
9, Diesel # of Gallons
10, Average distance to Customers
];

The straight table used KPI_TYPE as the dimension so those text values get listed in the leftmost column like any other dimension. The chart expression had to be slightly different for each column -- some columns were fiscal months, some fiscal quarters, one was a year-to-date column -- but they all looked something like this:
Pick(KPI_NO,
'  '.
$(v.intermodel_pcent_miles),
$(v.sea_pcent),
$(ltl_pcent),
$(avg_palletspertruck),
'    ',
$(total_diesel),
$(diesel_price),
$(diesel_tot_gallons),
$(avg_dist_to_cust)
)

This is simplified from the actual document.
  • Note that the Pick function in the expression uses KPI_NO which is defined in the inline load. The Pick function avoids “if” statements that can affect speed performance for the table as users change their selections.
  • The dimension isn’t a real data field. Each line in the expression must use the appropriate field names and set analysis to satisfy the requirement for the row. In the example above, variables are used for each row to help make the overall expression understandable at a glance.
  • Because each row is a different expression, the formatting must be done within the expression in the variable using the Num function. Some of those rows are integers, some are in thousands, some are money and some are percentages.
  • Because some of the expressions referred to other rows and columns (using the Above and RangeSum functions), I had to turn off the ability to drag and drop the columns. I also had to lock in the selection of all KPI_TYPE values so that the user wouldn’t accidentally make selections on KPI_TYPE. If the expressions were not using functions like Above then allowing the user to make selections on KPI_TYPE would make the chart more flexible.
  • Note that some of the KPI_TYPE values can be labels and the corresponding row in the expression is just a blank line. (Make sure the chart properties are not going to hide a row of all blanks.)
  • As Aaron mentions in a comment below, something similar can be achieved with a synthetic dimension using ValueList. I think ValueList is better suited to a small number of values but it offers the advantage that a user can't accidentally select one of the values by clicking on the chart; a downside is that it may consume more resources (versus an actual field dimension) when used with a large data model.

★★★

Sunday, December 4, 2016

Calendar Table and Preceding Load

Here’s an example of a calendar table built with a series of preceding load statements. The preceding load technique isn’t essential – this calendar table could have been built with a set of resident tables – but it is a good illustration of preceding load where each level of the load is a major revision of the level below.

This example code comes from a loadscript where the earlier part of the script has already loaded 50 million rows of transaction data. This calendar table part runs in a few seconds. The client for this example uses a July through June fiscal year and I left in the calculations for fiscal month, fiscal quarter and fiscal year (they use a label like FY15/16 for fiscal year).

Remember to read the comments starting at the bottom since that is how the preceding load logic works. Note the Dual function used to make sure the fiscal months sort properly from July to June in charts.

//Calendar table is for all Monday WeekStart dates for the transaction data  
TBL_CALENDAR:
LOAD
FDATE,
Month(FDATE) as CAL_MO,
Year(FDATE) as CAL_YEAR,
dual(Month(FDATE),if(num(Month(FDATE))>6,num(Month(FDATE))-6,num(Month(FDATE))+6))  AS FISCAL_MO,
dual('FY' & if(num(Month(FDATE))>6,   right(Year(FDATE),2)&'/'&right(Year(FDATE)+1,2), right(Year(FDATE)-1,2)&'/'&right(Year(FDATE),2)),
   if(num(Month(FDATE))>6, right(Year(FDATE),2),right(Year(FDATE)-1,2))) as FISCAL_YR,
Pick(Month(FDATE),'Q1','Q1','Q1', 'Q2','Q2','Q2', 'Q3','Q3','Q3', 'Q4','Q4','Q4') as QTR,
Pick(Month(FDATE),'Q3','Q3','Q3', 'Q4','Q4','Q4', 'Q1','Q1','Q1', 'Q2','Q2','Q2') as FISCAL_QTR;
//
//now, make a complete set of FDATEs between min and max
//without missing any weeks. Autogenerate keyword not needed  
Load 
Date(MINFDATE+(IterNo()*7)) as FDATE
While MINFDATE+(IterNo()*7) <= MAXFDATE;
//
//get the minimum (oldest) FDATE and maximum (newest) FDATE
Load 
Weekstart(Min(FDATE)) as MINFDATE,
Weekstart(Max(FDATE)) as MAXFDATE;
//
//First, get all of the distinct FDATE values that have been loaded 
Load date(FieldValue('FDATE', RecNo())) as FDATE 
AutoGenerate FieldValueCount('FDATE');

I didn’t use it here but a WHERE clause will work with preceding load (WHERE EXISTS… is often useful) The ORDER BY clause seems to cause an error with proceding load but you can use logic like this example where the FDATE values are built and loaded in ascending order.

★★★

Friday, November 18, 2016

MonthsBetween Function

A few days ago I needed a calculation in a loadscript that would provide the number of months between two dates. This kind of calculation comes up sometimes with supply chain planning data when the difference between the month a sales forecast is developed and the month that the sales forecast is meant to be applied, sometimes called a monthly lag, can be used to help measure forecast accuracy. QlikView provides a number of date functions but not one that can calculate the number of months between two dates.

I decided to make my own MonthsBetween date function using a document variable with input arguments. A variable like that can be used like a user defined function. I created a variable in the QlikView document and named it MonthsBetween. This is what I typed for the content of the variable (you, of course, can copy and paste from this blog post):
((Year($2)*12 + Month($2)) - (Year($1)*12 + Month($1)))

The $1 and $2 get replaced by two input arguments (which some people might call parameters) when the variable is used. The arguments must be dates. My loadscript (I was loading data from an Oracle database table) looked something like this:

BUDGET_HIST:
Load BUSINESS_UNIT,
BUDGET_AMOUNT,
CREATE_DATE,
APPLY_DATE,
$(MonthsBetween(CREATE_DATE, APPLY_DATE)) AS MONTH_LAG;
Select * from BUDGET_HIST_TABLE;

For example, if the CREATE_DATE is 1-APRIL-2016 and the APPLY_DATE is 1-JULY-2016 then MONTH_LAG is 3. If the second date is later than the first date then the result will be a positive integer number of months. If the two dates are from the same month then the result is a zero. For my data, the APPLY_DATE is always later than or equal to the CREATE_DATE so MONTH_LAG is always 0 or greater. If CREATE_DATE was later than APPLY_DATE then MONTH_LAG would be a negative number.

The calculation is nothing special -- it must have been re-created thousands of times. But, putting it into a variable makes the calculation easy to use in many places and easy to share. Test it out in a chart expression or advanced search expression

When a variable contains expression code like that and the variable might be used as part of a larger expression in a chart, then it is a good idea to enclose the entire variable contents within parentheses as I did above. It helps prevent QlikView from processing the operators in the larger expression in the wrong order.


★★★  

Wednesday, August 31, 2016

Is My KPI Getting Worse or Getting Better


We have a lot of QlikView documents that measure various kinds of business planning performance. The measures are usually a calculation of the difference between planned values and the actual values. Then the calculation is normalized by dividing it either by the planned or actual values so that the measure can be shown on a document as a percentage.

Key measures or key performance indicators (KPI) like that, if they are well chosen and well designed, give people an idea of how well a business process is performing. I believe that the measure can be even more useful if we add to it an indication of whether the measure has been getting “better” or getting “worse” over time.

Most of the performance measures we use are aggregation calculations that can be used with various dimensions like customer, product, or geographic region. When you show a measure like that in a line graph where the x-axis is time (for example, weeks or months) the line is usually jagged with up and down measures and maybe a “hump” or two. 

The first thing you can do to help someone figure out if the measure is getting better or worse is to use a trendline. In a line graph, go to the chart properties and pick the expression where you’d like to see the trend, find the trendlines section of the expression dialog and click the Linear check box. Now, click OK and you should see that QlikView has drawn a straight line across your jagged line graph. The upward or downward slope of the line can give a person some idea of whether the measure has been getting better or worse.
Note: I usually modify the Presentation tab of the chart properties and set Trendline Width to a narrow line (maybe 0.5) so it doesn’t compete with the main graph line for attention. 

If you stop reading this post at this point, then you’ve already picked up something useful.

Some users of these documents say that they would like the document to point out the areas where a particular measure is getting very bad so they know where to spend their time or resources. We can use a function already built in to QlikView to provide us with a number that would, if it was a trendline in a graph, be the slope of the line. 
If the line is going uphill from left to right, then that would be a positive slope and if it is going downhill then that would be a negative slope. A flat, level line would have a slope of zero. I’m going to describe here how to build a calculated dimension for a chart that can show the 20% of things where a measure has been deteriorating the worst over time.

First, to simplify the description, imagine that you have taken the performance measure expression from your straight table and put it into a variable. We’ll call the variable vKPI.
The time dimension field in the example is PWEEK.
And we are going to use this calculated dimension in a straight table where it will be labeled 20% of Projects Where KPI Has Deteriorated the Most
The linest function is used to calculate the slope of the imaginary graph line. 
The Rank function will help identify the 20% of worst performing PROJECTs. 
Aggr is used to create an array of PROJECT values for our chart dimension. In this example, the higher values indicate a deteriorating measure but you can flip the calculation around if your logic is reversed. 

=aggr(if((Rank(
  linest_m( 
  aggr(
  =$(vKPI)
  ,[PROJECT],PWEEK)
,PWEEK)
,1)-1) / Count(distinct total [PROJECT]) < 0.20, [PROJECT]), [PROJECT])

I know this is one of the more complex topics I’ve tackled in this blog but the result in a document that analyzes key performance indicators is valuable. In the documents where I’ve used this concept I usually provide a method for the user to choose what dimension field they would like to use, so it is not limited to PROJECT so the user could choose PRODUCT, SALESPERSON, or REGION, etc. And then for even more flexibility, I put the calculated dimension into a cycle group made up of other calculated dimensions that identify things that need attention.

In my next post, I’ll show how a related calculation could be used to color code items in a chart so that the items where performance has been getting worse are tagged with red and items where performance has been getting better are tagged with green.


  ★★★