Thursday, June 18, 2009

Load from Excel even if you don't know the tab name

Several months ago a reader asked if there was any way for a QlikView document to know the names of the tabs or sheets within an Excel spreadsheet without using VBA or macro code. Well, I still don't know the answer to that question. But, there is a way for a loadscript to load data from an Excel spreadsheet without knowing the tab or sheet name -- it just loads from the "first" or leftmost tab if you don't specify the table parameter.

For example, these lines in your loadscript will load from the first tab of an Excel spreadsheet named aaadata.xls:

XLTBL:
LOAD *
FROM [aaadata.xls] (biff, embedded labels);


When I have time I'll see if I can combine that with a macro that does a browse-for-file dialog and also automatically builds a table-box object and then we'll have a general purpose QlikView document that loads almost any spreadsheet in a useful way.

Make sure to read the comment from Rob Wunderlich. Thanks, Rob!

Wednesday, June 17, 2009

Test if QlikView table exists within the loadscript

Sometimes, within the loadscript, I need to check if a QlikView table exists or not. Most often this is done after bringing in a separate QlikView document's data with a BINARY statement and I don't know exactly which tables are included in the older document. Sometimes I just need to check if a QlikView table was successfully created earlier in the loadscript.

Here's the way I do it:

If NoOfRows('TABLE_1') > 0 then
  //Put statements here to be
  //used if TABLE_1 exists

Else
  //Put statements here to be
  //used if TABLE_1 does not exist

  End If



 * * *

Tuesday, June 16, 2009

Set analysis and chart dimension

Using set analysis syntax in a chart expression can be useful but remember that set analysis is based on the concept of selection -- it still must work within the dimension of the chart.
For example, if you have a document where HFcst and Lag are two fields, you might have a chart expression that looks like this:
Sum({$<Lag={2}>} HFcst)
That expression is a sum of HFcst with the set analysis syntax working like a selection override specifying that within the Sum function it should use the selection of Lag=2. You couldn't use that expression in a chart where Lag was one of the dimensions and have it work the way you might expect because the expression still must respect the dimensions of the chart -- each row of the chart must match a particular dimension value. But, as long as Lag is not used as the chart dimension then the expression could be very useful (perhaps as a comparison to the current Lag selection).

Sunday, June 14, 2009

Quick load of max field value

Sometimes, after loading a big table from several different sources I need to find the largest value of a field, or the smallest, or the average, etc. If the name of the big table is BIGTABLE and the field where I want to find the largest value is FDate then I could do something like this in the loadscript:

MAX_FDATE:
Load Max(FDate) resident BIGTABLE;

Depending on the size of BIGTABLE that could take a long time. There's a faster way - the first step is to build a table that contains all of the values of FDate. That looks like this:

ALL_FDATE:
Load FieldValue('FDate',IterNo()) as FDate
AutoGenerate(1)
While not Isnull(FieldValue('FDate',IterNo()));

That will load all of the existing distinct values of FDate almost instantaneously. It really is all of the values of FDate from all tables loaded up to that point in the loadscript and not just BIGTABLE so be aware of that if you use this method.
The second step is just to load the max FDate from our smaller ALL_FDATE table which should run much faster than loading from the resident BIG_TABLE:

MAX_FDATE:
Load Max(FDate) resident ALL_FDATE;
Drop table ALL_FDATE; //no longer needed

Saturday, March 14, 2009

Browse-for-File Macro Button

Several times I’ve created QlikView reports that must read data from a file but the name of the file can vary and must be supplied by the report user. Often this is a file that will be used by the loadscript so the user first clicks a macro button that runs a browse-for-file dialog and then the user clicks the standard Reload button.

Here’s an example of a macro that borrows the browse file dialog method from the Microsoft Excel object model. This macro should work on any PC that has MS Excel installed (it does not actually start Excel). You can copy the code from here and paste it into your QlikView report module code:

Sub Browse_for_File
'------------------
' Get the file pathname
Set oXL=CreateObject("Excel.Application")
f_name=oXL.GetOpenFilename("All Files (*.*),*.*",,"Select file",False)
If f_name="False" then
  'user cancelled out of dialog box
  Set oXL=nothing
  Exit sub
End If
'store file pathname in the file_pathname variable
ret=ActiveDocument.GetVariable("file_pathname").SetContent(f_name,false)
Set oXL=nothing
End Sub


This macro stores the file pathname for the file selected by the report user into a document variable named file_pathname (create the new document variable before running the macro). When I create a report that uses this macro I usually also put an input box in the report that shows the file pathname stored in the variable and allows the user an alternative method for specifying the pathname.

A QlikView document example of this technique, named BIG_Text_File.qvw, is available here – http://finmagic.0catch.com/ Sorry about the ads and popups – just click on the specific file download.

Monday, March 9, 2009

Finding Missing Dates With a Chart

Yesterday I was reviewing a report of sales and shipment data for 2007 and 2008. The data was very detailed with order numbers and quantities and shipdates. I wanted to know if any shipdates were missing (dates for which no shipping occurred) or if there were really shipments for every day in 2007 and 2008. This is how I looked for any missing dates:

I created a straight table chart with SHIPDATE as the dimension. I created two expressions. One expression, labeled Missing Date, looked like this:

If(Not IsNull(Above(SHIPDATE)),If(SHIPDATE-Above(SHIPDATE)<>1,Date(Above(SHIPDATE)+1)))

And the other expression, labeled Missing Date Count, looked like this:

If(Not IsNull(Above(SHIPDATE)),If(SHIPDATE-Above(SHIPDATE)<>1,(SHIPDATE-Above(SHIPDATE))-1))

The Chart Above() function lets the expression look at a value from the row above. Since the chart is sorted by SHIPDATE, the SHIPDATE value in the current row should be one day greater than the row above. If it isn’t then there is at least one missing date. The check on IsNull is required because there is no row above for the first row on the chart.

The chart worked ok. I tried it out on a sample of the data where I had intentionally excluded two dates from loading in the loadscript and it worked fine.
The total line for the Missing Date Count tells me if there are any missing dates. With this method, though, you have to scroll through down through the chart to find the specific missing dates. You can’t sort the chart to bring missing dates to the top because that would interfere with the Above() function in the expression (I disabled the interactive sort capability for this chart to make sure nobody tried changing the sorted order). The other drawback is that a gap of several consecutive missing dates only shows the one missing date that begins the gap (the Missing Date Count though shows a count of how many dates are missing in that gap).

This expression works in a chart with more dimensions too. For example, the dimensions might be ship-to-country and shipdate. Then the chart would show missing shipping dates (or dates where no shipping occurred) by ship-to-country. It only makes sense though to look for missing shipping dates within data where there is an assumption about shipping every day or every week day in order that your assumptions are tested versus the data.

I know that there are methods that could be used in the loadscript to look for missing dates and there are sql queries that can look for missing members of a sequence in the database. But, if anyone knows of a better method for finding missing dates using only QlikView sheet objects I’d be interested to know how.

Sunday, March 8, 2009

Using QlikView to Look at a Big Text File


This week I was loading data from a very large text file into a database and had some problems with some of the records. I needed to look at the group of records in the file that was causing the problem but the file was big, over 12 million records, too big for Notepad or Wordpad. I decided to load it into QlikView – not the most elegant use for QlikView but setting it up was fast and it worked out better than I hoped.

I put these lines into the loadscript:

BIGFILE:
Load RecNo() as recordno,
Len(@1) as recordlen,
@1 as RECORD
FROM E:\Data\bigtextfile.dat (ansi, txt, delimiter is '`', no labels)
where RecNo()<1000000;>

That RecNo() function in the Where clause limited the load to the first million records. I tried loading all 12 million rows but on my laptop with 2GB of memory the loaded report was pretty sluggish. I defined the backtick character as a delimiter even though it doesn’t appear in the data so I could load the entire record by calling it field @1.

The main tab on the report itself was a table box for all three fields and list boxes for each of the three fields. Selecting by recordno let me look at specific sections of the file. The list box for RECORD was very useful – all I had to do was type a few characters of the data I was looking for and hit the Enter key to select it and QlikView quickly showed me what I needed from the file.

It worked so well that I added a browse-for-the-file macro and changed the load statement to use variables that are defined with input boxes on the report to make it a useful, general purpose report for whenever I need to examine a text file. An example of this document, named BIG_Text_File.qvw, is available here – http://finmagic.0catch.com/ Sorry about the ads and popups – just click on the specific file download.