Friday, January 31, 2014

Easy way to pass down all parameters to Pentaho subreports

At yesterday’s Pentaho London User Group (PLUG) meetup I once again discussed the topic of subreports with Thomas Morgner. For a long time I’ve argued that subreports are a concept that shouldn’t be exposed to the report designer at all. We should all just start with a blank canvas (with no bands), add objects (tables, crosstabs, charts, etc.) and then link these objects to the data sources (frankly, quite similar to how it is done in some other report designers or in CDE). And parameters should simply be available everywhere, without having to map them to specific objects.
Anyway, during this discussion Thomas mentioned that there is actually no need to specify all the parameters in the subreport’s parameter mapping dialog. You can just add a single mapping with an asterisk (*) as both the outer and inner name, and all parameters will be passed down to the subreport automatically. Frankly, I was puzzled and astounded by this … I remembered the days when I was working on monster reports with 30 or 40 subreports and had to define 17 or so parameters for each of them, every single time. I did wonder why this was not documented anywhere. Needless to say, this is a real time saver!
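In other words, the whole import mapping reduces to a single wildcard row. Sketched out, the mapping dialog entry looks roughly like this (the exact dialog and column labels may differ slightly between PRD versions):

```
Import Parameters
Outer Name | Inner Name
-----------+-----------
*          | *
```

With this single row in place, every parameter of the master report becomes available inside the subreport without being listed individually.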
My next question was then: why is this not the default behaviour? If I don’t specify any parameter mappings, PRD should just pass down all parameters by default. So I created this JIRA case, and we all hope that it will be implemented as soon as possible. Please vote for it!

Excited as I was about this news, I had to quickly test the approach this morning. Here is the mapping in the subreport:
Then I just output the values in the Details band of the subreport:
And here is how the preview looks:
Now that was easy! Thanks Thomas!

Thursday, January 2, 2014

Building a Data Mart with Pentaho Data Integration (Video Course)

I have been an enthusiastic follower of the Pentaho open source business intelligence movement for many years. At the beginning of 2013 I was asked to create a video tutorial/course on populating a star schema with Pentaho Kettle. This was my first foray into video tutorials. The video is now available on the Packt website.

To me the most interesting experience on this project was finding an open source columnar database. Certainly I could have just gone down the road of using a standard row-oriented one, but having worked on projects that made use of commercial columnar databases, I understood their advantages quite well. To my surprise, the landscape of open source columnar databases was quite small. There had been a revival of sorts in the Hadoop world with Impala etc. (using dedicated file formats), but at the time this was probably a bit too cutting edge. The tutorial required a DB that had been established for some time and was easy to install: MonetDB. This is the same DB that Kettle itself uses for Instaview. It gave me the opportunity to discuss bulk loading and some of the advantages of columnar DBs.
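For readers curious what MonetDB bulk loading looks like outside of Kettle’s bulk loader step, it comes down to MonetDB’s COPY INTO statement. A minimal sketch (the table name and file path are purely illustrative, and the exact syntax can vary between MonetDB releases):

```sql
-- Load a comma-separated file straight into a MonetDB table.
-- Field delimiter ',', record delimiter '\n', string quote '"'.
COPY INTO fact_sales
FROM '/tmp/fact_sales.csv'
USING DELIMITERS ',', '\n', '"'
NULL AS '';
```

Because this bypasses row-by-row INSERTs, it is typically the fastest way to populate a columnar store like MonetDB.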

Creating these videos was not quite as easy as I had initially anticipated. I actually spent quite a lot of time on this project, and at the end of 2013 I rerecorded most of the video sessions to fix some pronunciation problems (although I’ve lived in the UK for 9 years now, I can’t quite hide my roots ;)) and reworked all the files to run on PDI v4.4 (initially I had been working with a trunk version of PDI v5).

I do hope that these videos provide the viewer with a nice introduction to this exciting topic. As I mention at the beginning of the course, this is not an introduction to Pentaho Kettle in general - I assume that the viewer already has some basic Pentaho Kettle knowledge. Furthermore, I decided to focus only on the Linux command line - but it shouldn’t be all too difficult for the viewer to translate everything to a Windows or Mac OS X environment. Is this course perfect? I don’t think so - but for my first foray into the video tutorial world I do hope it is worthwhile and teaches the viewer a few tips and tricks.

Lastly, I want to thank my reviewers for their support and honest feedback, Unnati at Packt Publishing for handling the administrative side, and finally Brandon Jackson for his help, support and work on some bugs related to the MonetDB bulk loader!