Showing posts with label loadscript. Show all posts
Showing posts with label loadscript. Show all posts

Friday, May 4, 2018

Loading a 1% Random Sample


I needed to load a one percent random sample of a large file last week. I’m sure others have figured out this method but this was the first time I coded it like this:
MYSAMPLE:
Load * from MYFILE.qvd (qvd)
Where Rand()<=0.01;

The Rand function returns a random value between 0 and 1. In the load statement above it would only be true 1% of the time (approximately) -- that would load only 1% of the rows. If it was coded as Where Rand()<=0.20 then it would be a 20% random sample, etc.

This method can be used with almost any Load statement even one of the steps in a preceding load. 
Note that we're talking about random numbers here so the total number of rows loaded using this technique will not be exactly 1% of the total rows in the file, although it will be close especially when the total number of rows in the file is very large. Be aware too that the rows that get loaded will not be an evenly distributed sample of rows although it will be close.

  ★★★

Sunday, May 14, 2017

Have you got a function similar to SQL NVL ?


I work with a group of people who are all experts in SQL. They all also have varying levels of technical skills and QlikView skills. One question I get frequently is whether QlikView has a function like the SQL NVL function.

For those of you not familiar with relational database SQL language, the NVL function takes two arguments: a field name or expression and a default value to be returned if the first argument is null. For example, a SQL database query may include the function like this:
  NVL(ORD_DISCOUNT, 0) AS ORD_DISCOUNT
And that would tell the SQL processor to look at the value of ORD_DISCOUNT and if it is null then return 0 as the function value otherwise return ORD_DISCOUNT.

When people ask about achieving the same thing with QlikView, I usually start by telling them that they can code an “if” statement like this:
  If(IsNull(ORD_DISCOUNT), 0, ORD_DISCOUNT)
and then I explain that there is a built-in QlikView function that can be used similar to NVL as long as the field you are checking is supposed to be numeric. It is the Alt function.
The Alt function accepts any number of arguments and looks at each of them going from left to right and returns whichever one is a valid number. So, repeating our example, if ORD_DISCOUNT is null then the following function will return a zero but if ORD_DISCOUNT is a valid number then the function returns ORD_DISCOUNT:
  Alt(ORD_DISCOUNT, 0)   

The Alt function will treat the last or rightmost argument as an “else” condition and return that value if none of the preceding arguments are numeric. The rightmost value need not be numeric so you could code something like this:
  Alt(ORD_DISCOUNT, 'Discount is missing')

The Alt function may be used in the loadscript code or in chart expressions.
If you are interested, the QlikView Help (search in Help for Conditional Functions) shows an interesting example for how the Alt function can be used to identify a date when the date value may be any one of several different date formats.  


  ★★★

Wednesday, April 5, 2017

Converting Edited Number Text into a Numeric Field

My co-worker, Naveen, had a requirement for a document that would load some financial data from a file that a client had sent to us. The client had created the file with a “screen scraping” type of application that captured the number fields as edited numeric text. For example, a field might contain  ($3,046.10)   So, that example includes a dollar sign, comma as a thousands separator, period as a decimal point, and parenthesis to indicate a negative number or credit.

Naveen needed to load the data as a number. Here’s how it was done:
We used two functions. A Num# function converted the text string into a dual field containing both a text portion and a numeric portion. Then, an outer Num function extracted just the numeric portion. For example, one of the number fields was named AUG. In the loadscript, the line that converted the edited number into a simple numeric field looked like this:
Num(Num#(AUG, '$#,##0;($#,##0)'))

The edit string or format code, the part within the single quotes, can vary depending on your requirement. The edit string I used in the example is a good one for the kind of editing you might find in a financial spreadsheet.

Here's a few other examples of edited number text and how it looks after using this expression:
AUG
Num(Num#(AUGUST,'$#,##0;($#,##0)'))
$0
0
$1
1
$1,251
1251
$1,465
1465
$1,466
1466
$3,724
3724
$4,158
4158
($11)
-11
($132)
-132
($2)
-2
($4)
-4
What now?
         (this is a null)
19
19
25.12
25.12
25.1234
25.1234
25.3
25.3
2,070
2070


  ★★★

Wednesday, December 28, 2016

Straight Table with Different Expressions on Each Row

We had a requirement a few months back for a document that would show a table with different expressions on each row along with some subtotals and expressions that reference other rows and columns. It needed to be a single table that could be sent to Excel. It was obviously a requirement that grew out of existing, legacy reporting solutions using Excel but many corporations are wedded to Excel and the larger the corporation, the harder it is to find anyone who believes they have the authority to make changes or do something different.

The solution was to build a document with a straight table with a single dimension that is loaded in the loadscript with an inline load like this:

TBL_KPI_TYPE:
LOAD * INLINE [
KPI_NO, KPI_TYPE
1, Transportation:
2, Intermodal % of Miles
3, Sea %
4, LTL %
5, Avg Pallets/Truck
6,
7, Total Diesel Cost
8, Diesel $/Gallon
9, Diesel # of Gallons
10, Average distance to Customers
];

The straight table used KPI_TYPE as the dimension so those text values get listed in the leftmost column like any other dimension. The chart expression had to be slightly different for each column -- some columns were fiscal months, some fiscal quarters, one was a year-to-date column -- but they all looked something like this:
Pick(KPI_NO,
'  '.
$(v.intermodel_pcent_miles),
$(v.sea_pcent),
$(ltl_pcent),
$(avg_palletspertruck),
'    ',
$(total_diesel),
$(diesel_price),
$(diesel_tot_gallons),
$(avg_dist_to_cust)
)

This is simplified from the actual document.
  • Note that the Pick function in the expression uses KPI_NO which is defined in the inline load. The Pick function avoids “if” statements that can affect speed performance for the table as users change their selections.
  • The dimension isn’t a real data field. Each line in the expression must use the appropriate field names and set analysis to satisfy the requirement for the row. In the example above, variables are used for each row to help make the overall expression understandable at a glance.
  • Because each row is a different expression, the formatting must be done within the expression in the variable using the Num function. Some of those rows are integers, some are in thousands, some are money and some are percentages.
  • Because some of the expressions referred to other rows and columns (using the Above and RangeSum functions), I had to turn off the ability to drag and drop the columns. I also had to lock in the selection of all KPI_TYPE values so that the user wouldn’t accidentally make selections on KPI_TYPE. If the expressions were not using functions like Above then allowing the user to make selections on KPI_TYPE would make the chart more flexible.
  • Note that some of the KPI_TYPE values can be labels and the corresponding row in the expression is just a blank line. (Make sure the chart properties are not going to hide a row of all blanks.)
  • As Aaron mentions in a comment below, something similar can be achieved with a synthetic dimension using ValueList. I think ValueList is better suited to a small number of values but it offers the advantage that a user can't accidentally select one of the values by clicking on the chart; a downside is that it may consume more resources (versus an actual field dimension) when used with a large data model.

★★★

Sunday, December 4, 2016

Calendar Table and Preceding Load

Here’s an example of a calendar table built with a series of preceding load statements. The preceding load technique isn’t essential – this calendar table could have been built with a set of resident tables – but it is a good illustration of preceding load where each level of the load is a major revision of the level below.

This example code comes from a loadscript where the earlier part of the script has already loaded 50 million rows of transaction data. This calendar table part runs in a few seconds. The client for this example uses a July through June fiscal year and I left in the calculations for fiscal month, fiscal quarter and fiscal year (they use a label like FY15/16 for fiscal year).

Remember to read the comments starting at the bottom since that is how the preceding load logic works. Note the Dual function used to make sure the fiscal months sort properly from July to June in charts.

//Calendar table is for all Monday WeekStart dates for the transaction data  
TBL_CALENDAR:
LOAD
FDATE,
Month(FDATE) as CAL_MO,
Year(FDATE) as CAL_YEAR,
dual(Month(FDATE),if(num(Month(FDATE))>6,num(Month(FDATE))-6,num(Month(FDATE))+6))  AS FISCAL_MO,
dual('FY' & if(num(Month(FDATE))>6,   right(Year(FDATE),2)&'/'&right(Year(FDATE)+1,2), right(Year(FDATE)-1,2)&'/'&right(Year(FDATE),2)),
   if(num(Month(FDATE))>6, right(Year(FDATE),2),right(Year(FDATE)-1,2))) as FISCAL_YR,
Pick(Month(FDATE),'Q1','Q1','Q1', 'Q2','Q2','Q2', 'Q3','Q3','Q3', 'Q4','Q4','Q4') as QTR,
Pick(Month(FDATE),'Q3','Q3','Q3', 'Q4','Q4','Q4', 'Q1','Q1','Q1', 'Q2','Q2','Q2') as FISCAL_QTR;
//
//now, make a complete set of FDATEs between min and max
//without missing any weeks. Autogenerate keyword not needed  
Load 
Date(MINFDATE+(IterNo()*7)) as FDATE
While MINFDATE+(IterNo()*7) <= MAXFDATE;
//
//get the minimum (oldest) FDATE and maximum (newest) FDATE
Load 
Weekstart(Min(FDATE)) as MINFDATE,
Weekstart(Max(FDATE)) as MAXFDATE;
//
//First, get all of the distinct FDATE values that have been loaded 
Load date(FieldValue('FDATE', RecNo())) as FDATE 
AutoGenerate FieldValueCount('FDATE');

I didn’t use it here but a WHERE clause will work with preceding load (WHERE EXISTS… is often useful) The ORDER BY clause seems to cause an error with proceding load but you can use logic like this example where the FDATE values are built and loaded in ascending order.

★★★

Friday, November 18, 2016

MonthsBetween Function

A few days ago I needed a calculation in a loadscript that would provide the number of months between two dates. This kind of calculation comes up sometimes with supply chain planning data when the difference between the month a sales forecast is developed and the month that the sales forecast is meant to be applied, sometimes called a monthly lag, can be used to help measure forecast accuracy. QlikView provides a number of date functions but not one that can calculate the number of months between two dates.

I decided to make my own MonthsBetween date function using a document variable with input arguments. A variable like that can be used like a user defined function. I created a variable in the QlikView document and named it MonthsBetween. This is what I typed for the content of the variable (you, of course, can copy and paste from this blog post):
((Year($2)*12 + Month($2)) - (Year($1)*12 + Month($1)))

The $1 and $2 get replaced by two input arguments (which some people might call parameters) when the variable is used. The arguments must be dates. My loadscript (I was loading data from an Oracle database table) looked something like this:

BUDGET_HIST:
Load BUSINESS_UNIT,
BUDGET_AMOUNT,
CREATE_DATE,
APPLY_DATE,
$(MonthsBetween(CREATE_DATE, APPLY_DATE)) AS MONTH_LAG;
Select * from BUDGET_HIST_TABLE;

For example, if the CREATE_DATE is 1-APRIL-2016 and the APPLY_DATE is 1-JULY-2016 then MONTH_LAG is 3. If the second date is later than the first date then the result will be a positive integer number of months. If the two dates are from the same month then the result is a zero. For my data, the APPLY_DATE is always later than or equal to the CREATE_DATE so MONTH_LAG is always 0 or greater. If CREATE_DATE was later than APPLY_DATE then MONTH_LAG would be a negative number.

The calculation is nothing special -- it must have been re-created thousands of times. But, putting it into a variable makes the calculation easy to use in many places and easy to share. Test it out in a chart expression or advanced search expression

When a variable contains expression code like that and the variable might be used as part of a larger expression in a chart, then it is a good idea to enclose the entire variable contents within parentheses as I did above. It helps prevent QlikView from processing the operators in the larger expression in the wrong order.


★★★  

Thursday, December 24, 2015

Chart Adjusts to Whatever Fieldnames are Loaded


I’m often called on to validate a new data file from a client or to examine a new data extract. My first step is to load data into a copy of Steve Dark’s Data Profiler (search on quickintelligence data profiler). I’ve made a few modifications to Steve’s original document – here are a couple of the most useful.

I added a line to the loadscript that uses preceding load to add a record number to the rows being loaded. My line is added between the table label and the first line of the regular Load statement. It looks like this:

DATAFILE:  //table label
Load *, recno() as zrecno;  //this is the preceding load line
Load   //this is the beginning of your "regular" Load statement

For the second set of modifications, I added a new sheet or tab which I named Table.
On the new tab, I added a multibox and then I added, not fieldnames, but 20 expressions. You can add any number you like but 20 seemed like plenty. These are mostly copy and paste so they are not a lot of typing and you can start by copying the expression shown in this blog and pasting it into your document. The first multibox expression looks like this:

[$(=only({1<$FieldNo={1}>} $Field))]

That expression evaluates to the fieldname corresponding to field #1. Then I add a second expression to the multibox:

[$(=only({1<$FieldNo={2}>} $Field))]

You can see that in the second expression, I changed $FieldNo={1} to $FieldNo={2} so that the second expression evaluates to the fieldname corresponding to field #2. I did that for each of the 20 expressions so that the 20th expression in the multibox contains $FieldNo={20}

Then, I added a straight table chart. I used zrecno as the dimension. Then, I added 20 expressions. For the first expression, I typed this:

[$(=only({1<$FieldNo={1}>} $Field))]

And for the label, I typed this:

$(=only({1<$FieldNo={1}>} [$Field]))

Again, the rest of the expressions are similar, each time changing the $FieldNo={1} to be 2 for the second expression and its label, $FieldNo={3} for the third expression and its label and so on for each of the 20 expressions. Click OK when you’re done.

Now, you have a multibox and a straight table chart that appear as though you typed in the 20 fieldnames. But, these objects will adjust themselves to whatever fieldnames are loaded. If you load a file with less than 20 fieldnames, the “extra” ones will just be null.

You can copy and paste these objects into other QlikView documents and they will adjust themselves to whatever the first 20 fieldnames are in the document (see important info in first blog comment below). But, I think these are most useful in a document where different fieldnames can be loaded each time.

I checked Dropdown Select for all columns in the Presentation tab to make it easier for the document user to make selections on the chart. 

In order that we don’t burn up all of the memory on the computer, I added a calculation condition to the straight table chart with this expression =if(count(zrecno) < 200000,1,0) so that the chart will refuse to appear if there are more than 200,000 rows. I customized the error message for the calculation condition to say, “There’s more than 200k rows. You have to make some selections to reduce the amount of data in this chart.” A calculation condition like this is a good idea whenever you have an object that might try to show more data than is possible with the available memory.

  ★★★ 

Saturday, March 7, 2015

How Many Days Since That Document Reloaded?

I was challenged to write a simple blog posting. Here it is, although it still has some code in it.

A recent project required a document to “know” how many days ago a separate .qvw document had been reloaded. This statement in the loadscript uses the pathname to the separate document and puts the answer to “how many days ago” in the variable vDaysAgo:

LET vDaysAgo = Now() - FileTime('e:\ventures\PriceAnalysis.qvw'); 


  ★★★

  

Saturday, January 31, 2015

General File Loading Technique

Here’s a technique I first used several years ago when I started a project where we had several dozen data files, all of them containing different data, in a delimited format and each of them with a header record as the first record containing the field names. The loadscript code below shows how we could load each of the files into different QlikView tables using the same loadscript.

The code below has the filename in the code but it could be loaded into the QlikView document by including the filename as a variable in the command line execution of QlikView (for an example, search on "qlikview maven command line and automation").  Other options include building a loop and loading each of files in a folder (search on “qlikview maven loading all of the files from a folder”); or, if you are comfortable with macro module code, you could use a browse-for-file technique to make it easy to find and load a filename (search on “qlikview maven browse-for-file macro button”).

Here’s the loadscript example code:  (some of the long lines are wrapped around as you view the code in the blog window but they will go back to normal if you copy and paste the code into a text file or loadscript edit window) 

Set xfile='customer_data_12.csv'; //file pathname
Let xfile=Upper('$(xfile)'); //optional convert to upper case

FIELDDAY:  //load first record only 
Load * from $(xfile)
(txt, no labels, delimiter is '|') 
where recno()=1;  //note "no labels" 

Set loadstatement='Load '; //now, build the Load statement
For i=1 to 100
Let fn=trim(SubField(peek('@$(i)',0,'FIELDDAY'),';',1));
if '$(fn)'='' then 
  exit for
  end if;
if $(i)=1 then
  Let loadstatement='$(loadstatement) @$(i) as [$(fn)]';  
else
  Let loadstatement='$(loadstatement), @$(i) as [$(fn)]';
  end if
Next

$(xfile): //QlikView table name
$(loadstatement) 
FROM
$(xfile) 
(txt, no labels, header is 1 line, delimiter is '|'); //change delimiter character as needed

Drop table FIELDDAY;

Store $(xfile) into $(xfile).qvd (qvd); //optional save as qvd
//Drop table $(xfile);  //optional drop table

Change the file attributes as needed for your situation: for example, Excel files, skip over unneeded header records, different delimiter, etc.

There are two important ideas illustrated in the example:  first, the idea of loading the header record of a file and building a custom load statement on the spot; second, the idea that you can construct any loadscript statement or part of a statement in a variable and execute it just by putting the variable in the right place in the loadscript.

This technique helped me in a recent project where the data files were supplied with field names in the header record but all of the field names needed a little modification in order to use them. Some of them I converted to upper case and some of them needed to have special characters removed. It was easy to add the necessary text functions to edit the field names as the code assembled the load statement.

It will help you debug and validate this technique if you set the document properties option to write out a loadscript log file when the loadscript runs. The log file will show all of the lines in the loadscript as they were executed with variables fully expanded.

  ★★★

Sunday, January 11, 2015

A Little More Optimized Load

This is a continuation of last week’s blog topic (search on "QlikView Maven Optimized QVD Load With a Where Clause and Dates"). 
I had a load statement in a loadscript loading from a qvd file that (simplifying a bit) looked like this:

INVOICE_DATA:   
Load CUST_NO,
ApplyMap('custprimap',CUST_NO) as CUST_PRIORITY,
INVOICE_NO,
INVOICE_AMT
from invoice_data.qvd (qvd) 
where exists(INV_DATE); 

It was written using a mapping table feature for customer data to avoid a left join load for performance reasons. But, the load from the qvd file wasn’t an optimized load because of the ApplyMap function. The table contained many millions of rows and using an optimized load would save quite a bit of time. 
I was able to get an optimized load and still have the ApplyMap functionality by recoding that portion of the loadscript to also use a Join and a super-fast load from FieldValue like this: 

INVOICE_DATA:   
Load CUST_NO, //optimized load
INVOICE_NO,
INVOICE_AMT
from invoice_data.qvd (qvd) 
where exists(INV_DATE);

Join (INVOICE_DATA) load //preceding load
CUST_NO,
ApplyMap('custprimap',CUST_NO) as CUST_PRIORITY; 
Load FieldValue(CUST_NO,IterNo()) as CUST_NO
Autogenerate(1) while not isnull(FieldValue(CUST_NO,IterNo()));



______________ _________________ ________________

Note to Santhosh:  You can add special comments to a chart expression that will make it easier to find specific expressions with the Settings->Expression Overview, Find feature. For example, you might use an expression like this:
sum(CONTRCT_S1 * CNTRY_ADJ) //per NATO procurement 1987.36.c
and then it would be easier to find the expressions that need to be changed when procurement rules change by doing an Expression Overview Find on "NATO procurement".
Our otherwise well-intentioned friends who believe all chart expressions should be contained within variables cannot easily use this suggestion.

  ★★★

Wednesday, January 7, 2015

Optimized QVD Load With a Where Clause and Dates

If you are working with large quantities of data and especially if you are working with large data sets of historical data where you add a portion of new data to the historical data each week (an incremental load), then you will want to be working with qvd files in your loadscript. And if you are working with qvd files then you certainly want to load data from the qvd files with an optimized load. The optimized load is many times faster than loading the same data with an unoptimized load. 

A common problem is that adding a where clause to the load statement will cause the load to be unoptimized. Some types of where clauses though, can be used with an optimized load. A simple where exists type of where clause, for example, can usually be used with an optimized load.

Every incremental load process is a little different but if your process involves dates then this idea may help. This idea is for situations where you want to load data from a qvd file and you want your where clause to refer to a date within the data. Here’s an example:

Imagine that you have a date like 01/01/2014 in a variable named start_dt and another date like 12/31/2015 in a variable named end_dt and you want to load invoice data where the invoice date, inv_date, is between start_dt and end_dt. First, build a little table of all possible dates between start_dt and end_dt. Use this in the loadscript before loading any other data containing a INV_DATE field:

/* Build table of dates just for the "where exists" clause */
TMP_DATE_TABLE:
Load date(date('$(start_dt)')+IterNo()-1) as INV_DATE 
AutoGenerate(1)
While date(date('$(start_dt)')+IterNo()-1) <= date('$(end_dt)'); 

That code can build a large table of dates in your loadscript in just a few seconds. Now, you can code the load from qvd like this:

INVOICE_DATA:   //this is an optimized load
Load * from invoice_data.qvd (qvd) 
where exists(INV_DATE);
Drop table TMP_DATE_TABLE; //no longer needed

That table of dates can be varied as needed for your specific application. For example, using (IterNo()-1)*7 could give you a set of week start dates. Use whatever method makes sense to create a list of the dates that you would want to load if they exist within the qvd file data. Remember, the basic idea here is to build a table of values you would like to load from the qvd file. It is not limited to dates and not all of the values in your temp table have to actually match something in the qvd file.

____________ _________________ _________________

Note to Santhosh:  If you need to quickly find the maximum or latest invoice date from the qvd file in order to use it with date arithmetic then you can get it in a variable by doing something like this in your loadscript before loading any other tables with the INV_DATE field:

TMP_INV_DATE:  //this is an optimized load
Load INV_DATE FROM invoice_data.qvd (qvd);
TMP_MAXINV_DATE:
Load Max(INV_DATE) as maxinvdate;  //preceding load
Load FieldValue('INV_DATE',IterNo()) as INV_DATE
AutoGenerate(1)
While not Isnull(FieldValue('INV_DATE',IterNo()));
Let var_maxinvdate = Peek('maxinvdate',-1,'TMP_MAXINV_DATE');
Drop tables TMP_INV_DATE, TMP_MAXINV_DATE; //not needed

★★★



Friday, January 2, 2015

Testing the Aggr Function Used in an Expression

I’m often asked to take a look at someone's QlikView document with the explanation that “the expression doesn't work”. If it is a complex expression with multiple parts or multiple functions I suggest taking each part of the expression and testing it to make sure it is performing the expected function. When one of the parts is something simple like  sum(InvoiceAmt)  then most people get the idea that they can put that little part of the expression into a temporary chart column or text box and then, based on the current selections and/or chart dimension, check that the function is really showing the sum of the invoice amounts.

If the part of the expression that needs to be tested is an aggr function then you need to know how to check it because the aggr function does not return a single value like the sum function; it returns an array of one or more values. An array of multiple values will usually show up as a null in a chart expression or text expression. If the array happens to have only one value or if all of the values in the array are exactly the same then you will be able to see the result of the expression but its no guarantee that the function is working as intended.

In order to test the aggr function itself, I recommend surrounding the aggr function with a concat expression that will help you analyze the array of the values returned by aggr. For example, if you had a straight table chart with a dimension of geographic region, you might have an expression like this to calculate average sales rep invoice amount totals:
  Avg( Aggr( sum(InvoiceAmt), SalesRep) )

In order to view the values returned by the aggr function, you could create a temporary column and copy and paste the aggr portion of the original expression and surround it with a concat function that might look like this:
  Concat( Aggr( sum(invoiceamt), SalesRep), ' / ' )

That would create a text string showing the individual values in the array separated or delimited by ' / '
Another useful delimiter is ', ' which creates a text string showing the individual values separated by commas.

When using this technique with a chart column there might not be enough space to show all of the array values separated by the delimiter. If that happens, right-click on the chart cell you would like to examine and choose Copy to Clipboard - Cell Value (in a text box you may right-click and choose Copy to Clipboard - Text) and then paste the value into a text file or spreadsheet or other convenient place to review a long text string.

If the array of values you see using this method is what you expected then check out all of the other sections of your expression – if all of them are working as you expect then assembling them into a single expression should work in your application. I often recommend building and testing each of the parts as a development technique when building a complex expression.

One other method for examining the array of values returned by the aggr function is to use it in a temporary chart and use the aggr function as a calculated dimension. A calculated dimension is, by definition, an array of values and it may help you review the values. 

_______________ ___________________ ___________________

Note to Santhosh:   You can remove leading zeros from a numeric identifier like a SAP 18-digit material number in the loadscript with a line like this:
   replace(ltrim(replace(MATNR,'0',' ')),' ','0') as MATNR_NO_ZEROS,

If you foresee the need to do a left join load using the original identifier then keep it as a separate field. If you would like to turn this into a user-defined function like I wrote about three years ago (search on "qlikview maven user-defined function") then do this:

  1. Create a variable named DropZeros
  2. Paste this into the variable:  replace(ltrim(replace($1,'0',' ')),' ','0')
  3. Now, you have something that works like a user-defined function. It could be used in your loadscript line like this:

   $(DropZeros(MATNR)) as MATNR_NO_ZEROS,



★★★

Saturday, April 5, 2014

Inline Load with Where Clause


You wouldn’t ordinarily use a Where clause with an inline load… after all, if you didn’t want all of the rows why bother typing them? But, as an example, I had a project two weeks ago where an inline load with a Where clause helped solve a problem. It was a QlikView document that could load data from any one of several dozen companies. The data included a company code but not the company name or any other friendly details for me to use in the document.
I coded an inline load with a row for each of the possible companies. In the loadscript, after loading the company data, I coded an extract of the company code from the data and stored it in a variable named var_cmp_code.

The inline load, simplified here, looked something like this:
COMPANY_INFO:
Load * inline
[cmp_code, company_name, company_industry
BA, Boeing, Aerospace
CCC, Chesterfield Coal Co, Mining and Minerals
LPL, London Pastry Ltd, Stale Biscuits]
Where cmp_code = '$(var_cmp_code)';

Since each load of the document is loading data for one company, the Where clause on the inline load makes it load one row only and each field name, therefore, has only one value. That allowed me to use company_name or any of the other field names in chart titles and text boxes. For example, I used a text expression in a chart title like this:  ='Future Health Care Risk by Year for ' & company_name
Remember that an inline load is like any other load except that the data is typed directly into the loadscript instead of being loaded from a separate file or database. In addition to a Where clause, you could code a While, Autogenerate, mapping table, Concatenate, Qualify, or the various types of joins available in a loadscript.

Wednesday, April 2, 2014

Oracle Hint in the Loadscript


People who load data from an Oracle database into QlikView may sometimes want to use an Oracle hint in the SQL query code. An Oracle hint tells Oracle about how you would like it to plan and execute the query in order to get faster, more efficient execution.
The Oracle hint syntax is actually a comment embedded in the SQL query. For example, the hint to tell Oracle to use up to four parallel processors might look like this in the first line of the query:
SELECT /*+ parallel(4) */ CUSTOMER, SUM(ORD_QTY) FROM   . . .

If you simply code a query like that in the loadscript you won’t get an error or a warning, the query will return data, but you won’t get the performance improvement you were hoping for either. That’s because QlikView doesn’t ordinarily pass comments through to the SQL processor. Oracle would never see your hint.
What is needed, is for you to tell QlikView not to strip out the comments before passing the query into the SQL processor. Code a line like this before your loadscript table definition:

Set StripComments = 0;
ORDER_TABLE:

SELECT /*+ parallel(4) */ CUSTOMER, SUM(ORD_QTY) FROM   . . .
The StripComments variable is a system variable that controls QlikView behavior. Setting the variable to zero tells QlikView not to strip out the comments. It is a good practice to turn StripComments back on after the query with a line like this:
Set StripComments = 1;

One thing to look out for is any other comments in your code. If you have a regular loadscript comment //(a comment beginning with two slashes) mixed in with your SQL code then it will cause a SQL error when it gets passed in to the SQL processor.



Monday, May 6, 2013

Loading an Unstructured Text File


This is the start of a series of postings about dealing with unstructured text data. I've used QlikView to help analyze message log files and error files and even program source code – there are some things QlikView can do that Notepad style text editors cannot.
First, we need to load the data. It isn’t always a completely unstructured file like my example, often the text is one or more fields inside a structured file or database.  There are various ways to load a text file; this is what I usually put into the loadscript:
TEXTFILE:
LOAD 
RecNo() as recno,
@1 as rectext
FROM
[$(File_Pathname)]
(txt, no labels, delimiter is \x7, no quotes); 

That code will load each record from the text file into the rectext field. I use a variable for the file pathname because I usually use a loadscript like this with a document where the user can supply a filename through an inputbox. Alternatively, you could add browse-for-file functionality like I wrote about in March 2009 (search on QlikView Maven Browse-for-File Macro Button). Because the data is unstructured and the document objects are meant to be used with unstructured data, the document works well as a general-purpose tool that can be used with many types of data in varying situations. 

I defined the file as though it is delimited and made the specified delimiter an uncommon character I don’t expect to find in the data. 

The record number, loaded as recno, from the input file is a useful piece of data and we’ll be using it in some of the example objects over the next few postings.

Saturday, January 12, 2013

Entire Load Statement in a Variable

QlikView document variables contain text. The software allows us to use variables in different situations and, for such a simple idea, they add considerable flexibility to help you satisfy requirements.
 

Most people use variables for field names, label text, expressions, and other relatively short bits of text. You can store longer text in a variable that may be useful. For example, I recently stored this entire bit of loadscript code in a variable named var_UOM_table:

/* Unit of Measure - loadscript code */
if NoOfRows('TBL_UOM_SELECT')>0 then
    Drop table TBL_UOM_SELECT;
end if
TBL_UOM_SELECT:
Add Load * inline
[Unit of Measure: uom_conv
Box: 1
Case: 0.1
Std Drum: (1/DRUMFCTR)
] (delimiter is ':');



In the loadscript I simply wrote the single line
$(var_UOM_table)

and that expanded to the entire segment of loadscript code when the document was reloaded.

This was useful to me for a document that loaded variable definitions from a database. Remember that variables can contain text expressions and accept arguments. Keep variables like this in mind for loadscript situations that require standard bits of text that may need to be repeated or shared between documents.


★ ★ ★







Monday, December 31, 2012

Execute Oracle Procedure From The Loadscript


I worked on a project a few months ago where several Oracle database tables had been designed specifically to support a QlikView document. The tables contained summarized data extracted from a large database. Often, database software, running on a large server can process large amounts of data more efficiently than QlikView can do it within the loadscript. In this particular case, the solution would only work correctly if the tables were updated shortly before loading the QlikView document. I decided to try something different and execute an Oracle stored procedure from a loadscript command. The procedure would update the tables and then the remainder of the loadscript could load data from the tables with SQL select statements.

The syntax for the procedure execution took a little trial and error. I put the execution immediately after the ODBC connect statement. This procedure takes a parameter which is supplied from a document variable. The procedure, named update_rpt_data, is implemented within an Oracle package named corp_report. The loadscript statement looks like this:

// Call Oracle data summarization procedure
sql call corp_report.update_rpt_data('$(var_COUNTRY)');

In order to make this work correctly with Oracle, I had to add a COMMIT statement as the first executable line in the stored procedure. Without the commit, Oracle threw an error.

If you use this technique, remember that the procedure execution may take extra seconds or minutes and this will add to the time required for the loadscript. Other types of database software may require different syntax; this example applies only to Oracle. Note that the statement only executes the procedure – no data is being returned directly from the procedure.