Saturday, March 7, 2015

How Many Days Since That Document Reloaded?

I was challenged to write a simple blog posting. Here it is, although it still has some code in it.

A recent project required a document to “know” how many days ago a separate .qvw document had been reloaded. This statement in the loadscript uses the pathname to the separate document and puts the answer to “how many days ago” in the variable vDaysAgo:

LET vDaysAgo = Now() - FileTime('e:\ventures\PriceAnalysis.qvw'); 


  ★★★

  

Saturday, January 31, 2015

General File Loading Technique

Here’s a technique I first used several years ago when I started a project where we had several dozen data files, all of them containing different data, in a delimited format and each of them with a header record as the first record containing the field names. The loadscript code below shows how we could load each of the files into different QlikView tables using the same loadscript.

The code below has the filename in the code but it could be loaded into the QlikView document by including the filename as a variable in the command line execution of QlikView (for an example, search on "qlikview maven command line and automation").  Other options include building a loop and loading each of files in a folder (search on “qlikview maven loading all of the files from a folder”); or, if you are comfortable with macro module code, you could use a browse-for-file technique to make it easy to find and load a filename (search on “qlikview maven browse-for-file macro button”).

Here’s the loadscript example code:  (some of the long lines are wrapped around as you view the code in the blog window but they will go back to normal if you copy and paste the code into a text file or loadscript edit window) 

Set xfile='customer_data_12.csv'; //file pathname
Let xfile=Upper('$(xfile)'); //optional convert to upper case

FIELDDAY:  //load first record only 
Load * from $(xfile)
(txt, no labels, delimiter is '|') 
where recno()=1;  //note "no labels" 

Set loadstatement='Load '; //now, build the Load statement
For i=1 to 100
Let fn=trim(SubField(peek('@$(i)',0,'FIELDDAY'),';',1));
if '$(fn)'='' then 
  exit for
  end if;
if $(i)=1 then
  Let loadstatement='$(loadstatement) @$(i) as [$(fn)]';  
else
  Let loadstatement='$(loadstatement), @$(i) as [$(fn)]';
  end if
Next

$(xfile): //QlikView table name
$(loadstatement) 
FROM
$(xfile) 
(txt, no labels, header is 1 line, delimiter is '|'); //change delimiter character as needed

Drop table FIELDDAY;

Store $(xfile) into $(xfile).qvd (qvd); //optional save as qvd
//Drop table $(xfile);  //optional drop table

Change the file attributes as needed for your situation: for example, Excel files, skip over unneeded header records, different delimiter, etc.

There are two important ideas illustrated in the example:  first, the idea of loading the header record of a file and building a custom load statement on the spot; second, the idea that you can construct any loadscript statement or part of a statement in a variable and execute it just by putting the variable in the right place in the loadscript.

This technique helped me in a recent project where the data files were supplied with field names in the header record but all of the field names needed a little modification in order to use them. Some of them I converted to upper case and some of them needed to have special characters removed. It was easy to add the necessary text functions to edit the field names as the code assembled the load statement.

It will help you debug and validate this technique if you set the document properties option to write out a loadscript log file when the loadscript runs. The log file will show all of the lines in the loadscript as they were executed with variables fully expanded.

  ★★★

Sunday, January 11, 2015

A Little More Optimized Load

This is a continuation of last week’s blog topic (search on "QlikView Maven Optimized QVD Load With a Where Clause and Dates"). 
I had a load statement in a loadscript loading from a qvd file that (simplifying a bit) looked like this:

INVOICE_DATA:   
Load CUST_NO,
ApplyMap('custprimap',CUST_NO) as CUST_PRIORITY,
INVOICE_NO,
INVOICE_AMT
from invoice_data.qvd (qvd) 
where exists(INV_DATE); 

It was written using a mapping table feature for customer data to avoid a left join load for performance reasons. But, the load from the qvd file wasn’t an optimized load because of the ApplyMap function. The table contained many millions of rows and using an optimized load would save quite a bit of time. 
I was able to get an optimized load and still have the ApplyMap functionality by recoding that portion of the loadscript to also use a Join and a super-fast load from FieldValue like this: 

INVOICE_DATA:   
Load CUST_NO, //optimized load
INVOICE_NO,
INVOICE_AMT
from invoice_data.qvd (qvd) 
where exists(INV_DATE);

Join (INVOICE_DATA) load //preceding load
CUST_NO,
ApplyMap('custprimap',CUST_NO) as CUST_PRIORITY; 
Load FieldValue(CUST_NO,IterNo()) as CUST_NO
Autogenerate(1) while not isnull(FieldValue(CUST_NO,IterNo()));



______________ _________________ ________________

Note to Santhosh:  You can add special comments to a chart expression that will make it easier to find specific expressions with the Settings->Expression Overview, Find feature. For example, you might use an expression like this:
sum(CONTRCT_S1 * CNTRY_ADJ) //per NATO procurement 1987.36.c
and then it would be easier to find the expressions that need to be changed when procurement rules change by doing an Expression Overview Find on "NATO procurement".
Our otherwise well-intentioned friends who believe all chart expressions should be contained within variables cannot easily use this suggestion.

  ★★★

Wednesday, January 7, 2015

Optimized QVD Load With a Where Clause and Dates

If you are working with large quantities of data and especially if you are working with large data sets of historical data where you add a portion of new data to the historical data each week (an incremental load), then you will want to be working with qvd files in your loadscript. And if you are working with qvd files then you certainly want to load data from the qvd files with an optimized load. The optimized load is many times faster than loading the same data with an unoptimized load. 

A common problem is that adding a where clause to the load statement will cause the load to be unoptimized. Some types of where clauses though, can be used with an optimized load. A simple where exists type of where clause, for example, can usually be used with an optimized load.

Every incremental load process is a little different but if your process involves dates then this idea may help. This idea is for situations where you want to load data from a qvd file and you want your where clause to refer to a date within the data. Here’s an example:

Imagine that you have a date like 01/01/2014 in a variable named start_dt and another date like 12/31/2015 in a variable named end_dt and you want to load invoice data where the invoice date, inv_date, is between start_dt and end_dt. First, build a little table of all possible dates between start_dt and end_dt. Use this in the loadscript before loading any other data containing a INV_DATE field:

/* Build table of dates just for the "where exists" clause */
TMP_DATE_TABLE:
Load date(date('$(start_dt)')+IterNo()-1) as INV_DATE 
AutoGenerate(1)
While date(date('$(start_dt)')+IterNo()-1) <= date('$(end_dt)'); 

That code can build a large table of dates in your loadscript in just a few seconds. Now, you can code the load from qvd like this:

INVOICE_DATA:   //this is an optimized load
Load * from invoice_data.qvd (qvd) 
where exists(INV_DATE);
Drop table TMP_DATE_TABLE; //no longer needed

That table of dates can be varied as needed for your specific application. For example, using (IterNo()-1)*7 could give you a set of week start dates. Use whatever method makes sense to create a list of the dates that you would want to load if they exist within the qvd file data. Remember, the basic idea here is to build a table of values you would like to load from the qvd file. It is not limited to dates and not all of the values in your temp table have to actually match something in the qvd file.

____________ _________________ _________________

Note to Santhosh:  If you need to quickly find the maximum or latest invoice date from the qvd file in order to use it with date arithmetic then you can get it in a variable by doing something like this in your loadscript before loading any other tables with the INV_DATE field:

TMP_INV_DATE:  //this is an optimized load
Load INV_DATE FROM invoice_data.qvd (qvd);
TMP_MAXINV_DATE:
Load Max(INV_DATE) as maxinvdate;  //preceding load
Load FieldValue('INV_DATE',IterNo()) as INV_DATE
AutoGenerate(1)
While not Isnull(FieldValue('INV_DATE',IterNo()));
Let var_maxinvdate = Peek('maxinvdate',-1,'TMP_MAXINV_DATE');
Drop tables TMP_INV_DATE, TMP_MAXINV_DATE; //not needed

★★★



Friday, January 2, 2015

Testing the Aggr Function Used in an Expression

I’m often asked to take a look at someone's QlikView document with the explanation that “the expression doesn't work”. If it is a complex expression with multiple parts or multiple functions I suggest taking each part of the expression and testing it to make sure it is performing the expected function. When one of the parts is something simple like  sum(InvoiceAmt)  then most people get the idea that they can put that little part of the expression into a temporary chart column or text box and then, based on the current selections and/or chart dimension, check that the function is really showing the sum of the invoice amounts.

If the part of the expression that needs to be tested is an aggr function then you need to know how to check it because the aggr function does not return a single value like the sum function; it returns an array of one or more values. An array of multiple values will usually show up as a null in a chart expression or text expression. If the array happens to have only one value or if all of the values in the array are exactly the same then you will be able to see the result of the expression but its no guarantee that the function is working as intended.

In order to test the aggr function itself, I recommend surrounding the aggr function with a concat expression that will help you analyze the array of the values returned by aggr. For example, if you had a straight table chart with a dimension of geographic region, you might have an expression like this to calculate average sales rep invoice amount totals:
  Avg( Aggr( sum(InvoiceAmt), SalesRep) )

In order to view the values returned by the aggr function, you could create a temporary column and copy and paste the aggr portion of the original expression and surround it with a concat function that might look like this:
  Concat( Aggr( sum(invoiceamt), SalesRep), ' / ' )

That would create a text string showing the individual values in the array separated or delimited by ' / '
Another useful delimiter is ', ' which creates a text string showing the individual values separated by commas.

When using this technique with a chart column there might not be enough space to show all of the array values separated by the delimiter. If that happens, right-click on the chart cell you would like to examine and choose Copy to Clipboard - Cell Value (in a text box you may right-click and choose Copy to Clipboard - Text) and then paste the value into a text file or spreadsheet or other convenient place to review a long text string.

If the array of values you see using this method is what you expected then check out all of the other sections of your expression – if all of them are working as you expect then assembling them into a single expression should work in your application. I often recommend building and testing each of the parts as a development technique when building a complex expression.

One other method for examining the array of values returned by the aggr function is to use it in a temporary chart and use the aggr function as a calculated dimension. A calculated dimension is, by definition, an array of values and it may help you review the values. 

_______________ ___________________ ___________________

Note to Santhosh:   You can remove leading zeros from a numeric identifier like a SAP 18-digit material number in the loadscript with a line like this:
   replace(ltrim(replace(MATNR,'0',' ')),' ','0') as MATNR_NO_ZEROS,

If you foresee the need to do a left join load using the original identifier then keep it as a separate field. If you would like to turn this into a user-defined function like I wrote about three years ago (search on "qlikview maven user-defined function") then do this:

  1. Create a variable named DropZeros
  2. Paste this into the variable:  replace(ltrim(replace($1,'0',' ')),' ','0')
  3. Now, you have something that works like a user-defined function. It could be used in your loadscript line like this:

   $(DropZeros(MATNR)) as MATNR_NO_ZEROS,



★★★

Thursday, December 4, 2014

Function With Flexible, Choosable Dimensions

I worked on a document last week that calculated a performance indicator metric based on budget amounts compared to actual spending amounts. The chart expression for the calculation used the Aggr function and two dimensions: spending program and geographic region. Simplifying from the actual expression, it looked something like this:
Sum( Aggr( Fabs( Sum(spendqty) – Sum(budgetqty) ), SProgram, SRegion))

The business requirement had changed and I needed to provide a method so the person using the document could choose the level at which to calculate the differences and the level could involve one or two different fields and a choice of weekly or monthly time period.
I decided to use a method that would eliminate the need for if statements in the expression and also provide an easy way to add more choices in the future. First, I built an inline load of the choices in the loadscript (a data island since the field names do not appear in any other table):
CALC_LEVEL:
Load * Inline
[CALCLVL_DESC : CALCLVL_CODE
Program, Region, Weekly : SProgram, SRegion, WEEK_start_date
Program National, Weekly : SProgram, WEEK_start_date
Region, Weekly : SRegion, WEEK_start_date
Program, Region, Monthly : SProgram, SRegion, Fiscal_period445
Program National, Monthly : SProgram, Fiscal_period445
Region, Monthly : SRegion, Fiscal_period445
] (delimiter is ':');

Note that I specified a colon as the field delimiter so that I could include commas in the field values of CALCLVL_DESC.

In the document, I created a Listbox for the CALCLVL_DESC field and titled it “Please choose a calculation level:” So, the business user can see the possible choices using familiar words that make sense in a business context. The listbox was configured for Always One Selected Value. When the user makes a selection of CALCLVL_DESC it means that there is only one available value of CALCLVL_CODE which is important for making the expression work correctly.

The dimension part of the Aggr function was changed so that the expression now looks like this:
Sum( Aggr( Fabs( Sum(spendqty) – Sum(budgetqty) ), $(=CALCLVL_CODE) ))

That makes the dimension part of the Aggr function take on the specific field names listed in the CALCLVL_CODE value. It allows the user’s choice to automatically control the fields used in the dimension. Adding new choices in the future would be a matter of adding new lines to the inline load.

This is a powerful technique, it is not limited to the Aggr function or an inline load data island. It can replace an entire expression or any part of an expression based on the current selection (as you read that sentence, keep in mind how flexible and powerful the concept of selection is for QlikView).   


★★★

Tuesday, November 25, 2014

Problem: Drag and Drop Straight Table Columns Not Working

One of the other consultants at work came to me last week with a seemingly odd problem. She had a document with a straight table that originally had one dimension column. She added two more dimension columns and QlikView added the new columns on the right-hand side of the table. But, my co-worker wanted all three dimension columns to be on the left of the table which is the way we usually set up a straight table. Normally, dragging the new columns over to the left should be quick and easy… but, no matter how many times she tried, the new columns could not be dragged and stayed stubbornly on the right.

After close examination, some trial and error, and a bit of logical thinking; the source of the problem became clear. The thing that was interfering with the drag-and-drop of the dimension columns was the few columns in the chart that were not visible. In this case, there were two kinds: a hidden column, and three conditional columns that would only be visible under specific conditions.

The source of the problem suggested a solution:  right-click on the straight table, select Properties and choose the Presentation tab. One by one, click on each of the columns shown on the Presentation tab and make sure the Show Column radio button is selected for all columns (not Hidden or Conditional). Then, when all columns are visible, the drag-and-drop will work normally. After rearranging the columns as desired, use the chart Properties and Presentation tab again to restore the hidden columns and conditional columns.
One of the blog comments below suggests there is another type of non-visible column that would affect a Table Box: a column referring to a field no longer available in the document. You would have to edit properties for the Table Box and remove the field that no longer exists to restore drag-and-drop capability.