Thursday, July 23, 2009

Pentaho Xactions: Prepare your data for the Google Visualization

Pentaho Xactions: Prepare your data for the Google Visualization API


We want to create a line chart like the one shown in http://code.google.com/apis/visualization/documentation/gallery/linechart.html . Take a good look at the code below the chart, this is the one that we will try to prepare here.

Let's kick off with a simple MDX query that will display the amount of new registrations for the period 2009 Week 10 to Week 22:

SELECT 
[Country].Children
ON COLUMNS,
{[Registration Time Weekly].[2009].[10]:[Registration Time Weekly].[2009].[22]}*{[Measures].[New Registrations]}
ON ROWS
FROM
[Users]


I created a "Get Data From OLAP" step in my Xaction in the Pentaho Design Studio and inserted this query plus specified all other necessary parameters.

This step will bring back a query result. Name the result "query_result". To check what it looks like, you can drag and drop the "query_result" into the "Process Outputs" area. Save the file and execute it on the Pentaho BI Server. Now you will get a vague idea how the data looks like.

Now, let's write some JavaScript. Create a new step by choosing "Get data from > JavaScript". For the script inputs choose our "query_result". Now paste following JavaScript code into the main text area.

It is important to understand that x and y axis titles are stored in the metadata object. As we need these titles for the Google Chart definition, we will have to read them out. 
In the y axis, we have actually two titles, the first one being "New Registrations" and the second one being the week numbers. We will have to add the week numbers to our main dataset that we pass on to the Google chart definition. We will use getRowHeaders() and a loop to read out these titles. It is worth mentioning that in the array the week numbers will be in the first column and "New Registrations" in the second column (so actually in the opposite way as it is displayed). 

Let's start with our JavaScript:

var thedata="";

// we get a simple row and column count of the data set
rowCount=query_result.getRowCount();
colCount=query_result.getColumnCount();

var meta = query_result.getMetaData();

// get the column headers
var colHeaders = meta.getColumnHeaders(); //getColumnHeaders() returns object[][]

var colName="";

// create the code for the google chart definition
var i = 0;
for ( i = 0 ; i < colCount ; i++ ){
   colName+="data.addColumn('number','"+colHeaders[0][i]+"');\n"
     // use colName for something
}

var rowHeaders = meta.getRowHeaders(); // returns object[][]

var rowName="";

var i = 0;
for ( i = 0 ; i < rowCount ; i++ ){
// get the second row of the array as there are two dimensions in the y axis
   rowName+=rowHeaders[i][0]+"\n"
}


It is worth testing the x and y axis title output separately before going ahead. Make sure that everything looks fine! This is the first part. Define calName as a script output, then drag and drop "colName" from the "Process Actions" area to the "Process Outputs" area. Save everything and execute the Xaction on the BI Server. You will see that we have now created some part of the code that we will use in the Google Visulatization Chart definition. Do the same test for "rowName", and if everything is fine, carry on with this code:

// loop trough each row and column
for(row=0; row<rowCount; row++)
{
thedata+="data.setValue("+row+",0,'"+rowHeaders[row][0]+"');\n"
    for(column=0; column<colCount; column++)
    {
        // check if it is not a number
        if(isNaN(query_result.getValueAt(row,column))) {
        // now we prepare the string for that our Google Visualization Chart expects - > for text
        // Important: the column index has to be increase by one as we have made the week axis title as our first column!
        thedata+= "data.setValue("+row+","+(column+1)+",'"+query_result.getValueAt(row,column)+"');\n"
        }
        else
        {
        // now we prepare the string for that our Google Visualization Chart expects - > for numbers
        // Important: the column index has to be increase by one as we have made the week axis title as our first column!
        thedata+= "data.setValue("+row+","+(column+1)+","+query_result.getValueAt(row,column)+");\n"
        }
    }
}

Test the output for "thedata" and check if it is in line with the Google Chart definition.

Now, let's create a Message template to prepare the final html output. Define colName, rowCount and thedata as input. Insert this code into the main text area:

<html>
  <head>
    <script type="text/javascript" src="http://www.google.com/jsapi"></script>
    <script type="text/javascript">

//-------------------- Column Chart ------------------//
      google.load("visualization", "1", {packages:["linechart"]});
      google.setOnLoadCallback(drawChart);
      function drawChart() {
        var data = new google.visualization.DataTable();
        data.addColumn('string', 'Week');
// get column heads from dataset metadata
{colName}
        data.addRows({rowCount});

{thedata}

        var chart = new google.visualization.LineChart(document.getElementById('chart_div'));
        chart.draw(data, {width: 400, height: 240, legend: 'bottom',  title: 'New Registrations By Channel Last 7 Days'});
      }
    </script>
  </head>

  <body>
<div id="chart_div" style="float:left;clear:both"></div>

  </body>
</html>


Then define the output name as "out". Drag and drop it into the Process Outputs area (make sure you delete any other outputs before), then click on out in the Process Outputs area and change the "Name" from "out" to "content" in the Output Destination.

That's it. Save everything and then you should have a fancy Google line chart working.

You can download the file from here (it's not exactly the same, but quite similar).

Please note: If you are using SQL, the JavaScript has to be different. I provide one example using SQL here.








 



19 comments:

  1. Really good post. Where can I get the source code of this example? Do you have the xaction? My e-mail is: caiomsouza@gmail.com

    ReplyDelete
  2. This comment has been removed by the author.

    ReplyDelete
  3. Thanks Caio! I am not sure if I still have the xaction file. I'll have a look tomorrow in the office and send you an email if I find it.

    ReplyDelete
  4. where can we get the source code for this example

    ReplyDelete
  5. I try to upload the example within the next 3 days ...

    ReplyDelete
  6. I added the file. You can find the link at the end of the article.

    ReplyDelete
  7. Thank Diethard Steiner. Looks like the url is not correct and its not giving the required information. Can you please send me the required info.

    ReplyDelete
  8. It should work now, please try again. You will be redirected to a Google Docs download page.

    ReplyDelete
  9. Thank you very much for the file. I did something similar and see that the javascript output variable is not being replaced with the actual when we place that in the message template text as {thedata} according to your example. So can you tell me how to get that value.

    ReplyDelete
  10. In this case the javascript will be different ... I'll upload a SQL example.

    ReplyDelete
  11. An example using SQL is available now (You can find it at the end of the article).

    ReplyDelete
  12. Thanks Diethard, i still see it not working instead displaying the variable as {thedata} only using your code also. In the log its showing this

    WARN [TemplateUtil] not found: thedata

    Is there anyway i can send the action file to you which i used for your reference so that it can be made working

    ReplyDelete
  13. Please leave a comment with your email address and a link to download your file. Please base your file on the sampledata DB that comes with the Pentaho BI Server. I cannot promise you that I will have a chance to look at it, as I am quite busy now.

    ReplyDelete
  14. I am not aware on how i can share the file you. i would like to email the file so please let me know how i can do or i their anyway i can attach the file here.
    Sorry for disturbing you.

    ReplyDelete
  15. You can use something like dropbox.com or you can even make it available on Google Docs. Otherwise, just post your email here and I'll get in contact with you.

    ReplyDelete