Thursday, June 18, 2009

Load from Excel even if you don't know the tab name

Several months ago a reader asked if there was any way for a QlikView document to know the names of the tabs or sheets within an Excel spreadsheet without using VBA or macro code. Well, I still don't know the answer to that question. But, there is a way for a loadscript to load data from an Excel spreadsheet without knowing the tab or sheet name -- it just loads from the "first" or leftmost tab if you don't specify the table parameter.

For example, these lines in your loadscript will load from the first tab of an Excel spreadsheet named aaadata.xls:

XLTBL:
LOAD *
FROM [aaadata.xls] (biff, embedded labels);


When I have time I'll see if I can combine that with a macro that does a browse-for-file dialog and also automatically builds a table-box object and then we'll have a general purpose QlikView document that loads almost any spreadsheet in a useful way.

Make sure to read the comment from Rob Wunderlich. Thanks, Rob!

Wednesday, June 17, 2009

Test if QlikView table exists within the loadscript

Sometimes, within the loadscript, I need to check if a QlikView table exists or not. Most often this is done after bringing in a separate QlikView document's data with a BINARY statement and I don't know exactly which tables are included in the older document. Sometimes I just need to check if a QlikView table was successfully created earlier in the loadscript.

Here's the way I do it:

If NoOfRows('TABLE_1') > 0 then
  //Put statements here to be
  //used if TABLE_1 exists

Else
  //Put statements here to be
  //used if TABLE_1 does not exist

  End If



 * * *

Tuesday, June 16, 2009

Set analysis and chart dimension

Using set analysis syntax in a chart expression can be useful but remember that set analysis is based on the concept of selection -- it still must work within the dimension of the chart.
For example, if you have a document where HFcst and Lag are two fields, you might have a chart expression that looks like this:
Sum({$<Lag={2}>} HFcst)
That expression is a sum of HFcst with the set analysis syntax working like a selection override specifying that within the Sum function it should use the selection of Lag=2. You couldn't use that expression in a chart where Lag was one of the dimensions and have it work the way you might expect because the expression still must respect the dimensions of the chart -- each row of the chart must match a particular dimension value. But, as long as Lag is not used as the chart dimension then the expression could be very useful (perhaps as a comparison to the current Lag selection).

Sunday, June 14, 2009

Quick load of max field value

Sometimes, after loading a big table from several different sources I need to find the largest value of a field, or the smallest, or the average, etc. If the name of the big table is BIGTABLE and the field where I want to find the largest value is FDate then I could do something like this in the loadscript:

MAX_FDATE:
Load Max(FDate) resident BIGTABLE;

Depending on the size of BIGTABLE that could take a long time. There's a faster way - the first step is to build a table that contains all of the values of FDate. That looks like this:

ALL_FDATE:
Load FieldValue('FDate',IterNo()) as FDate
AutoGenerate(1)
While not Isnull(FieldValue('FDate',IterNo()));

That will load all of the existing distinct values of FDate almost instantaneously. It really is all of the values of FDate from all tables loaded up to that point in the loadscript and not just BIGTABLE so be aware of that if you use this method.
The second step is just to load the max FDate from our smaller ALL_FDATE table which should run much faster than loading from the resident BIG_TABLE:

MAX_FDATE:
Load Max(FDate) resident ALL_FDATE;
Drop table ALL_FDATE; //no longer needed

Saturday, March 14, 2009

Browse-for-File Macro Button

Several times I’ve created QlikView reports that must read data from a file but the name of the file can vary and must be supplied by the report user. Often this is a file that will be used by the loadscript so the user first clicks a macro button that runs a browse-for-file dialog and then the user clicks the standard Reload button.

Here’s an example of a macro that borrows the browse file dialog method from the Microsoft Excel object model. This macro should work on any PC that has MS Excel installed (it does not actually start Excel). You can copy the code from here and paste it into your QlikView report module code:

Sub Browse_for_File
'------------------
' Get the file pathname
Set oXL=CreateObject("Excel.Application")
f_name=oXL.GetOpenFilename("All Files (*.*),*.*",,"Select file",False)
If f_name="False" then
  'user cancelled out of dialog box
  Set oXL=nothing
  Exit sub
End If
'store file pathname in the file_pathname variable
ret=ActiveDocument.GetVariable("file_pathname").SetContent(f_name,false)
Set oXL=nothing
End Sub


This macro stores the file pathname for the file selected by the report user into a document variable named file_pathname (create the new document variable before running the macro). When I create a report that uses this macro I usually also put an input box in the report that shows the file pathname stored in the variable and allows the user an alternative method for specifying the pathname.

A QlikView document example of this technique, named BIG_Text_File.qvw, is available here – http://finmagic.0catch.com/ Sorry about the ads and popups – just click on the specific file download.

Monday, March 9, 2009

Finding Missing Dates With a Chart

Yesterday I was reviewing a report of sales and shipment data for 2007 and 2008. The data was very detailed with order numbers and quantities and shipdates. I wanted to know if any shipdates were missing (dates for which no shipping occurred) or if there were really shipments for every day in 2007 and 2008. This is how I looked for any missing dates:

I created a straight table chart with SHIPDATE as the dimension. I created two expressions. One expression, labeled Missing Date, looked like this:

If(Not IsNull(Above(SHIPDATE)),If(SHIPDATE-Above(SHIPDATE)<>1,Date(Above(SHIPDATE)+1)))

And the other expression, labeled Missing Date Count, looked like this:

If(Not IsNull(Above(SHIPDATE)),If(SHIPDATE-Above(SHIPDATE)<>1,(SHIPDATE-Above(SHIPDATE))-1))

The Chart Above() function lets the expression look at a value from the row above. Since the chart is sorted by SHIPDATE, the SHIPDATE value in the current row should be one day greater than the row above. If it isn’t then there is at least one missing date. The check on IsNull is required because there is no row above for the first row on the chart.

The chart worked ok. I tried it out on a sample of the data where I had intentionally excluded two dates from loading in the loadscript and it worked fine.
The total line for the Missing Date Count tells me if there are any missing dates. With this method, though, you have to scroll through down through the chart to find the specific missing dates. You can’t sort the chart to bring missing dates to the top because that would interfere with the Above() function in the expression (I disabled the interactive sort capability for this chart to make sure nobody tried changing the sorted order). The other drawback is that a gap of several consecutive missing dates only shows the one missing date that begins the gap (the Missing Date Count though shows a count of how many dates are missing in that gap).

This expression works in a chart with more dimensions too. For example, the dimensions might be ship-to-country and shipdate. Then the chart would show missing shipping dates (or dates where no shipping occurred) by ship-to-country. It only makes sense though to look for missing shipping dates within data where there is an assumption about shipping every day or every week day in order that your assumptions are tested versus the data.

I know that there are methods that could be used in the loadscript to look for missing dates and there are sql queries that can look for missing members of a sequence in the database. But, if anyone knows of a better method for finding missing dates using only QlikView sheet objects I’d be interested to know how.

Sunday, March 8, 2009

Using QlikView to Look at a Big Text File


This week I was loading data from a very large text file into a database and had some problems with some of the records. I needed to look at the group of records in the file that was causing the problem but the file was big, over 12 million records, too big for Notepad or Wordpad. I decided to load it into QlikView – not the most elegant use for QlikView but setting it up was fast and it worked out better than I hoped.

I put these lines into the loadscript:

BIGFILE:
Load RecNo() as recordno,
Len(@1) as recordlen,
@1 as RECORD
FROM E:\Data\bigtextfile.dat (ansi, txt, delimiter is '`', no labels)
where RecNo()<1000000;>

That RecNo() function in the Where clause limited the load to the first million records. I tried loading all 12 million rows but on my laptop with 2GB of memory the loaded report was pretty sluggish. I defined the backtick character as a delimiter even though it doesn’t appear in the data so I could load the entire record by calling it field @1.

The main tab on the report itself was a table box for all three fields and list boxes for each of the three fields. Selecting by recordno let me look at specific sections of the file. The list box for RECORD was very useful – all I had to do was type a few characters of the data I was looking for and hit the Enter key to select it and QlikView quickly showed me what I needed from the file.

It worked so well that I added a browse-for-the-file macro and changed the load statement to use variables that are defined with input boxes on the report to make it a useful, general purpose report for whenever I need to examine a text file. An example of this document, named BIG_Text_File.qvw, is available here – http://finmagic.0catch.com/ Sorry about the ads and popups – just click on the specific file download.

Sunday, February 22, 2009

Report Tells User ABOUT The Data

Last week a coworker, who was working on a report to validate some data sent to us by a client, asked me how to highlight null values in a chart in the report. I showed her how to do that (Click here to read it). And I offered another suggestion: you don’t want your report user to have to scroll through a lot of data or have to sort the chart to find out if any null values exist; it will help the report user if you provide a text box that tells right away if any nulls or zeros or whatever other interesting conditions exist in the data. Then, the report user can decide how to use the other objects on the report to analyze the data.

This report was a data validation report but the idea can be used for executive summaries and other kinds of reports. The goal is to have a text object that uses easy-to-understand language to tell the report user about important aspects of the data. The idea is similar to the discussion of putting counts in a chart title from a few months ago (Click here to read it). When you design a text object like this you should ask yourself: “what things will the report user be looking for in this report?” and “what questions will the report user be asking of the data?” The text box is telling the report user important things About the data not just summarizing the data.

Here’s what we tried out in the text box of a draft version of that report:

='Current Selection Data Issues:
'&
If(sum(If(Len(Trim(product))=0,1,0))>0,'Data contains null product values.
')&
If(sum(If(Len(Trim(ship_loc))=0,1,0))>0,'Data contains null ship_loc values.
')&
If(sum(If(Len(Trim(mkt_channel))=0,1,0))>0,'Data contains null mkt_channel values.
')&
If(sum(If(Len(Trim(ship_loc))>0 and Len(Trim(ship_loc))<>4,1,0))>0,'Data contains wrong length (not 4) ship_loc values.
')&
If(sum(If(qty<0,1,0))>0,'Data contains negative qty.
')&
If(sum(If(qty=0,1,0))>0,'Data contains zero qty.
')&
If(sum(If(qty<>floor(qty),1,0))>0,'Data contains fractional qty.
')&
If(sum(If(Len(Trim(qty))=0,1,0))>0,'Data contains null or blank qty.
')&
If(sum(If(qty>100000,1,0))>0,'Data contains unusually large qty.')

That text expression only shows the messages if the data conditions actually exist. If some important data condition can only be detected during during data loading then consider capturing the important condition in a document variable while the loadscript is running and then use the document variable in the text box.

We also added this little text box of various counts:

='Current Selection Contains:
'&Num(Count(Distinct product),'###,###')&' products
'&Num(Count(Distinct ship_loc),'###,###')&' ship_loc
'&Num(Count(Distinct mkt_channel),'###,###')&' mkt_channels
'&Num(Count(Distinct product&'.'&ship_loc&'.'&mkt_channel),'###,###')&' SKUs (product/ship_loc/mkt_channel)
'&Num(Count(Distinct orderid),'###,###')&' orders
'&Num(Count(Distinct shipdate),'###,###')&' shipdates, '&min(shipdate)&' to '&max(shipdate)

The data issues text box could also have been written to show counts of the various issues.

Another interesting way to use text expressions like that is in a straight table chart. You can set up the chart with a dimension like shipping-location and put the text expression (minus that first line which is a title) in the chart expressions and then you have a chart that uses easy-to-understand language to tell about data conditions for each shipping-location.

Saturday, February 21, 2009

Highlighting a Chart Value


I was asked, “How do you highlight a null value in a straight-table chart?” If you want to set the background color or text color based on a range of expression values then the Chart Properties->Visual Cues is the way to go. But the Visual Cues tab doesn’t work well for null values and it won’t help if you want to highlight the value based on anything more than a simple test of values. For that, you want to use a separate, special expression to set the background color or text color.

Right-click on the chart and select Properties and go to the Expressions tab. Click on the little plus sign ( + ) to the left of the expression you want to highlight. See the example in the picture above. In the example, the expression is named Shipments and I have clicked on the Background Color choice and entered this expression into the box:
  If(IsNull(Shipments),LightRed())

That expression tells the chart to use the LightRed color as a background color when the Shipments value is null. There are other colors you could use – search on color functions in the Help text to find information about other colors. Background color works well to highlight values in a straight table chart but it won't help with line graphs or bar charts. The expression can include other factors like the source of the data or the value in the chart dimension or the average monthly sales volume for 2008; use whatever factors that provide a reason for pointing out that particular expression value.

Thursday, February 12, 2009

Capture Variable Values in the Loadscript


I often put statements in the loadscript that record current conditions as document variables in case the information might be useful someday when checking into a problem or simply to document the origin of the data. Here’s a few that have proved useful:

let reload_time = now();
let user = OSuser();
let computer = ComputerName();
If IsPartialReload() then
   let partial_reload_time = now();
Else
   let partial_reload_time = '';
End If

Those variables can be displayed on the title tab or in a corner of the main tab.

You can capture information from the data that is being loaded and store it in a variable. This is useful for control information that would be displayed on the title page like the system information above. It can also be useful to control If statements and Loops and Where clauses in the loadscript.

Here’s an example that reads a date from an Oracle table and stores it in a QlikView document variable so it can be displayed in a chart title:

//Find OHPOST date and store in a one-row table
DUNK_OHPOST:
SELECT TO_CHAR(MAX(OHPOST),'MM/DD/YYYY') AS DUNK_OHPOST FROM SKU;

//Now, put the value into a document variable
Let DUNK_OHPOST=Date(Peek('DUNK_OHPOST',0,'DUNK_OHPOST'));
Drop table DUNK_OHPOST;

That Peek function takes 3 arguments: fieldname, row number, and table name. For a one-row table use zero to indicate the first row. If that is a unique field name then you could also use the FieldValue function to retrieve the value from the first field value.

In older QlikView versions you could use the Peek function in a text object but that isn't allowed any longer. So, store the values in document variables and you can display them in text object or chart titles. Consider displaying data values that tell the report user something important about the data; such as earliest or latest posting dates, high or low order numbers, current fiscal period, currency conversion factor, source application version number, security or classification warning, etc.

Saturday, January 31, 2009

QlikView Exists Function

One QlikView function I use often in the loadscript is the Exists function. You give it a value or expression and the name of a field and it tells you if the value already has been loaded into the field. The function returns a True or False so it can be used in the Where clause of a Load statement or in an If statement. There are a lot of uses for the function and below are three examples that I’ve used recently. For the first example, I use it near the end of the loadscript when I am loading master data to match data already loaded in previous tables. Here’s what the code might look like when loading a product description to go along with previously loaded product code values:

PRODUCT_MASTER:
Load
PRODUCT_CODE,
PRODUCT
Where Exists(PRODUCT_CODE);
SQL SELECT
PRODUCT_CODE,
PRODUCT_CODE' - 'DESCRIPTION AS PRODUCT
FROM PRODUCT_MASTER;

That piece of code will only load product descriptions for product codes that were already loaded in previous tables. Note that the Exists function is a QlikView function so it is used as part of the QlikView Load statement and not as part of the SQL statement. Only one argument for the function is used in this case since the field name and the value to be checked are the same.

A second example of a place I use the Exists function is when I load temporary tables with special field names and values that fit a particular type or grouping so I can use the values to differentiate values in other fields I’m loading. Here’s an example: I load product codes into a special field that will contain only product codes for products made with 100% recycled materials:

TEMP_RECYCLE_PRODUCTS:
Load
PRODUCT_CODE as RECYCLE_PRODUCT
Resident PRODUCT_MASTER
Where CONTENT='100 POST-CONSUMER RECYCLE';


Now, I can use that field name, RECYCLE_PRODUCT, with the Exists function as I load or process other data. This technique is especially useful when the data is coming from different sources where something like a SQL join is not available. I might use it like this when loading other data:

GREEN_SALES:
//Sales of product made from recycled materials
Load
CUSTOMER,
SALES_DATE,
QUANTITY
Resident 2009_SALES_DETAIL
Where Exists(RECYCLE_PRODUCT,PRODUCT_CODE);

Drop table TEMP_RECYCLE_PRODUCTS;

At the end of the loadscript or when it is no longer needed you should drop temporary tables so that they don't unnecessarily use up memory or create unneeded associations.

In this third example, I use the Exists function to load only the first of a set of identical values as they are read in the input data. For example, this piece of code in the loadscript will only load data for the first time an error message appears in an error message log file. The input from the database is sorted into timestamp order and the Exists function in the Load statement checks to see if the specific error message value has already been loaded:

ERROR_HISTORY:
Load
ERROR_TIMESTAMP,
ERROR_MESSAGE
Where Not Exists(ERROR_MESSAGE);
SQL SELECT
ERROR_TIMESTAMP,
ERROR_MESSAGE
FROM MESSAGES_LOG
ORDER BY ERROR_TIMESTAMP;

Saturday, January 24, 2009

Benefits of Concatenating a Code and Description

In the typical business report in a corporate environment you need to include quite a few selections either through a list box or multi-box. There may be selections for product, customer, sales executive, division, channel, product type, country, fiscal period, etc. You probably have codes for those things in your corporate databases and, if you’re lucky, you also have descriptions that go along with the codes.

When we build a new QlikView report we design the loadscript to load not just the codes but also the concatenation of the code and the corresponding description separated by a hyphen. That part of the loadscript code might look like this:
PRODUCT_MASTER:
SQL SELECT
  PRODUCT_CODE,
  PRODUCT_CODE||' - '||DESCRIPTION AS PRODUCT
FROM PRODUCT_MASTER;

Then, we use the PRODUCT field in list boxes, multi-boxes and in the chart dimensions. The reason this idea is so valuable is because of QlikView’s behavior when you are making a selection: if you click on the title bar of a list box or on the down arrow of a multi-box and then start typing, the visible list of items changes to show only the ones that match what you have typed so far.

For example, if you click the down-arrow on the multi-box selection for our example field, PRODUCT; and then start typing baby -- all of the products that contain the word “baby” in the product description move to the top of the list high-lighted with a white background. Keep typing, baby lotion… and now only the products whose names contains the words “baby lotion” are on the top of the list. At that point you can either press the Enter key which will select all of the high-lighted products or you can click on an individual product with your mouse. If that field only contained product codes then you would be limited to typing in a subset of the product code – which you can still do, of course, with the concatenated field values but the extra value of being able to type in a partial description to help with the selection is extremely useful.

A description for a code value is so useful that we have sometimes loaded the description data from a spreadsheet when it isn’t available on the corporate database. Then it is joined and concatenated to the code value in the loadscript.

Using the concatenated code and description as a dimension in the charts also works well. The field still sorts by product code since that is the leftmost part of the field value. Often we shorten the field width so that just the product code is showing on the screen but moving your cursor over the field reveals the complete description.

Thursday, January 22, 2009

Error: fetched column value was truncated

I got an error yesterday on a QlikView document that had reloaded dozens of times before. It appeared to be an Oracle SQL error. The error message that appeared in the pop-up box during loading said, SQL Error:[Oracle][ODBC][Ora]ORA-01406:fetched column value was truncated

Despite appearing like a SQL error, the piece of SQL code didn’t cause any errors when submitted through Oracle SQL*Plus. It only caused an error when used to load a table in the QlikView document. I was using QlikView version 8.50.6206.5.

With a little trial-and-error I narrowed down the source of the problem to a single database column, named QTY, which was defined in the Oracle database as a FLOAT. Since this document had reloaded successfully in the past, it must be the new data stored in the table that is causing the problem now.

In any case, I was able to make the document load and satisfy the reporting requirements by simply changing the SQL a little from SQL SELECT QTY… to SQL SELECT ROUND(QTY,2)… After this change, the document reloaded with no problem.

I’m hoping this blog entry might help someone fix the problem if it happens to them. And, inasmuch as this blog also serves as my own QlikView notes it might serve as a reminder to myself when it happens again in one of my documents.

(Note the comment describing another experience with this error)

Wednesday, January 14, 2009

Load a Table With All of the Values for a Field

An idea for developers of large reports--

Sometimes in the loadscript you need to build a table that contains all of the previously loaded values from the tables that have already been loaded. This situation comes up when a concatenated key has been used in the tables. For example, you might have a report that loads data into several tables from different sources but you’ve loaded each of the tables so that it has a key field comprised of product_code, customer_number, ship_location, and ship_date; all concatenated together but separated with an underscore character. This technique is sometimes used in large reports in order to reduce memory requirements by minimizing the number of synthetic keys that QlikView must create. Each of the large tables shares only one field, the concatenated key field, instead of sharing multiple fields and causing QlikView to build the synthetic keys.

At the bottom of the loadscript you must collect all of the concatenated key values and explode them into the individual fields so that QlikView can build the associations. If the concatenated key from our example is named CCKEY then code like this would load all of the existing CCKEY values:

ALL_CCKEY_TABLE:
Load
FieldValue('CCKEY',IterNo()) as CCKEY
AutoGenerate(1)
While not Isnull(FieldValue('CCKEY',IterNo()));

That will create a table of all CCKEY values. Now add the part that splits out the fields based on the underscore separator character:

Left join load
CCKEY,
subfield(CCKEY,'_',1) as product_code,
subfield(CCKEY,'_',2) as customer_number,
subfield(CCKEY,'_',3) as ship_location,
subfield(CCKEY,'_',4) as ship_date
Resident ALL_CCKEY_TABLE;

A table like this loads very fast since all of the data is already available in memory.

Saturday, January 10, 2009

Archive Data


We sometimes use QlikView as a method for capturing and saving data for archive purposes. These reports aren’t normally used unless someone needs to review historic data or research data to answer a question. For example, if we have a project where we might need to check on what the product master looked like six weeks ago (most companies only care about current master data and they don’t keep track of historic master data), then we might set up a daily job that loads a QlikView document with the product master and any associated data. Then, the document is simply saved in case it is needed. We usually give the document file a filename containing the date to help organize the files.

This advantage of this method over some kind of data dump file is that the method for reviewing the data (QlikView) is automatically part of the process. Anyone who needs to review the data need only click on one of the files in the archive folder. The data in QlikView qvw files is stored in compressed form so no further zipping is needed to conserve disk space.

Tuesday, January 6, 2009

Adding an Image to the Document


You can easily add visual interest to your QlikView document with a few lines or colored boxes or graphic elements. Add lines of any thickness or color by selecting from the menu Layout->New Sheet Object->Line/Arrow Object... The object properties are used to adjust the color and appearance of the line or arrow. You can drag the line around your document to get the most logical or appealing placement.

For colored boxes that stand out from the tab or sheet background insert a text object (Layout->New Sheet Object->Text Object). In the object properties choose a background color that works well with the sheet background. Drag the text object around and adjust the size with your mouse. Whether you actually have text in your text object is up to you. The text objects can be overlaid or layered with other sheet objects like charts. This is a common way to add a legend or explanation of the data that will appear alongside a chart. The Layout tab in the object properties has a Layers setting that will help with overlapping objects. The transparency setting can also help with layered objects. Don't hesitate to click the Help button for more information if you see something that you don't understand.

The background of the tab or sheet can be changed by right-clicking on the top of the sheet where the tab label is. Then select Sheet Properties and look for the controls that adjust background color or let you add an image. A background photograph is often too busy for a sheet containing chart objects but it may be perfect for a title sheet. The background and colors of a well-designed website might give you some ideas for how to design the sheets of your document.

The text object can show a diagram or picture. This is an easy way to add a corporate or department or project logo to the document. When you add the text object look at the General tab of the object properties. In the Background area click the Image button and then the Change button (btw- note how QlikView uses lines and boxes to group the logical elements of the properties window… a few lines can make the information on your document easier to understand too). Then you can select a jpg or png file from your computer and insert it into the document. Try it out! Even if you are designing a business document there's no need to be boring. Insert a cartoon image of your boss... but before the document is seen by company management you should make sure the image is respectable and that you or your company own the rights. Consider this method for adding a diagram to your document or adding photos of products or a map of locations.

Text objects can be set up so that they appear or are invisible based on the data. On the text object Layout tab you can click the Conditional button and then enter an expression that will control whether the text box is visible or not. You could make the text object visible based on the person using the report, the loaded data, time of day, or day of the week. Use it to alternate between two or more different text objects with different images. For example, in a corporate dashboard application you could make the image of a sunny day or of a stormy day appear based on the current month's revenue versus last month (or whatever passes for good news versus bad news in your business). The expression should be something that evaluates to a zero or a non-zero value. If the value is zero then the text object is not visible.