Quick Step By Step Guide
You can find the main Pentaho Wiki doc here.
This is an updated (2013-08-19) and extended version of the original tutorial which I posted here a few years ago. It now takes more than 10 seconds to read, but the instructions should be more precise than before.
Specifying JNDI and adding the JDBC driver
- Define a JNDI connection. You have to use the same JNDI connection as you specified on the BI Server (in the Pentaho Administration Console). The JDBC details can be specified in the jdbc.properties file in the following directory: <pme-root-dir>/simple-jndi
Add the following (amend to your requirements):
dwh/type=javax.sql.DataSource
dwh/password=
The first part before the forward slash is the JNDI name (so in this case dwh).
- Check if the required JDBC driver is installed under <pme-root-dir>/libext/JDBC
If not, download the JDBC driver from your vendor’s website and copy the jar file into this folder.
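The two checks above can be sketched in a few lines of Python. This is only an illustration, not part of PME: the function names `parse_simple_jndi` and `find_driver_jars` are hypothetical, the second connection name (`staging`) and the password value are placeholders, and the sample only uses the two keys shown in this tutorial.

```python
from collections import defaultdict
from pathlib import Path


def parse_simple_jndi(text):
    """Group "<jndi-name>/<key>=<value>" entries by JNDI name."""
    connections = defaultdict(dict)
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        left, value = line.split("=", 1)
        if "/" not in left:
            continue
        # Everything before the first forward slash is the JNDI name.
        name, key = left.split("/", 1)
        connections[name.strip()][key.strip()] = value.strip()
    return dict(connections)


def find_driver_jars(pme_root):
    """List the jar files found under <pme-root-dir>/libext/JDBC."""
    jdbc_dir = Path(pme_root) / "libext" / "JDBC"
    if not jdbc_dir.is_dir():
        return []
    return sorted(p.name for p in jdbc_dir.glob("*.jar"))


# Placeholder content: "staging" and "secret" are made up for illustration.
sample = """\
dwh/type=javax.sql.DataSource
dwh/password=secret
staging/type=javax.sql.DataSource
"""

connections = parse_simple_jndi(sample)
print(sorted(connections))          # the JNDI names defined in the file
print(connections["dwh"]["type"])   # the type declared for dwh
```

To use it on a real installation, read `<pme-root-dir>/simple-jndi/jdbc.properties` into a string and pass your PME root directory to `find_driver_jars`.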
Importing the physical tables
- Start PME by running the following command in the PME root directory: sh ./metadata-editor.sh
- Click on File > Save and specify a good name for your metadata model.
- Right click on Connections on the top left hand side and choose New Connection .... Define your connection details. Make sure you choose JNDI in the Access selection and specify the same JNDI name in the Settings section as you originally specified in the jdbc.properties file.
Some interesting options are found in the Advanced section: If you work with a database like PostgreSQL or Oracle that supports proper schemata, you can define the default schema to use here. If you have columns of type boolean, you can also enable support for them.
After entering all the details, click the Test button to make sure that your connection details are correct. Once you get a successful return message, click OK.
- Right click on the database connection you just created and choose Import Tables:
- Expand the database node (dwh in the screenshot below) so that you can see all the imported tables:
- Specify table properties: Double click on each table to specify various settings. For example, specify whether it is a fact or a dimension table. For measures, also configure the aggregation type. If you want to add calculated fields, now is the time to do so: Double click on the respective table. Once the window is open, click on the + icon. Give the field a name (e.g. PC_Amount_of_users). Define the aggregation type (e.g. Count). Define the Data Type. If you don't know the length and precision yet, set them to -1. Define the formula (for a simple count or sum, just enter the name of the column). That's it. (Field type can stay on "other").
In the formula field you can use database-specific functions as well (e.g. "YEAR(date)"). In this case you have to tick "Is formula exact?".
You can add other properties like text alignment or date mask by clicking on the + icon.
Understanding Is the Formula Exact?
You can add columns which are based on native SQL fragments, for example:
((CURRENT_DATE - start_date)/30)::int + 1
You specify this in the Formula Value field:
If you tick Is the Formula Exact?, the PME engine will not try to interpret this SQL fragment but will instead pass it to the database as is.
The disadvantage of this approach is that you might end up using functions which are specific to your database, so the model will not be that easily portable to other DBs (in case you ever have to migrate it).
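To make clear what the sample fragment above computes, here is a rough Python equivalent. It is a sketch under assumptions: start_date is taken to be a DATE column in PostgreSQL (where subtracting two dates yields a whole number of days and integer division truncates), and the function name `thirty_day_period` is made up for illustration.

```python
from datetime import date


def thirty_day_period(start_date, today):
    """Mimic ((CURRENT_DATE - start_date)/30)::int + 1 for two dates."""
    # Difference in whole days, like date - date in PostgreSQL.
    days = (today - start_date).days
    # int() truncates toward zero, matching integer division in SQL.
    return int(days / 30) + 1


start = date(2013, 1, 1)
print(thirty_day_period(start, date(2013, 1, 15)))  # 14 days  -> period 1
print(thirty_day_period(start, date(2013, 3, 1)))   # 59 days  -> period 2
print(thirty_day_period(start, date(2013, 3, 2)))   # 60 days  -> period 3
```

So the fragment assigns each row to a consecutive 30-day bucket counted from start_date, starting at 1.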
Another common use case is to add a measure in case your raw data table doesn’t have one:
Defining a business model
- Right click on Business Model and select New Business Model. Give it a name and fill in the remaining properties.
- Drag and drop the tables onto the main working area.
- Double click on the table and go to Model Descriptor. If the type is incorrectly set (or not applicable for this model) click on the overwrite icon and define the respective table type (fact or dimension). Click OK.
- In order to create relationships between tables, select the two tables while pressing down the CTRL key and then right click on the last table and choose New Relationship:
Another way to do this, although not that convenient, is to right click on the work area and choose New Relationship:
Create a business view
- Once the business tables and relationships are established, we can create the business view. Right click on Business View and select New Category. An easier way to do this is to choose Tools > Manage Categories (or right click the Category Editor icon in the toolbar). This will bring up the Category Editor dialog. Just click the + icon to add new categories. Define categories, e.g. Date, Measures, Countries etc. Categories are basically buckets that help you organize the various business columns.
- Next we want to assign business columns to each category. If you created your categories in the tree view, right click on Business View and choose Manage Categories. Once in the Category Editor, use the arrows to move the fields into the categories.
Testing the metadata model
Now that the main metadata model is defined, it is time to test it. Click on the Query Builder icon in the toolbar and run some test queries. You can check the generated SQL by clicking on the SQL icon.
Publish the metadata model to the Pentaho BI Server
If testing is successful, publish the model to the BI Server (click on File > Publish To Server ...). The final metadata model is saved as an XMI file. On the BI Server, there can be only one XMI file per solution folder. Make sure that:
- You have an account on the BI Server.
- The publisher password on the BI Server is set up and you know it.
- You know the name of the solution folder that you should publish the model to.
- The URL to publish to is something like http://localhost:8080/pentaho/RepositoryFilePublisher
Make sure you have RepositoryFilePublisher at the end of this URL!
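If you want to sanity-check the publish URL before typing it into the dialog, a small Python helper along these lines could do it. This is just a sketch: `is_publish_url` is a hypothetical name, and the check only looks at the shape of the URL, it does not contact the server.

```python
from urllib.parse import urlsplit


def is_publish_url(url):
    """Return True if the URL is http(s) and ends with RepositoryFilePublisher."""
    parts = urlsplit(url)
    # Last path segment, ignoring a trailing slash.
    last_segment = parts.path.rstrip("/").rsplit("/", 1)[-1]
    return (
        parts.scheme in ("http", "https")
        and bool(parts.netloc)
        and last_segment == "RepositoryFilePublisher"
    )


print(is_publish_url("http://localhost:8080/pentaho/RepositoryFilePublisher"))  # True
print(is_publish_url("http://localhost:8080/pentaho"))                          # False
```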
Tips and tricks
Make use of Concepts
In the toolbar you can find the Concept Editor icon. Concepts are pretty much like CSS style definitions. Concepts can be assigned to each data point and are used in the final reports as default formats. One of the most important properties is probably Mask for Number or Date, which allows you to enter standard formatting strings (e.g. #,###.00).
To assign a Concept simply right click on the data point and choose Assign Parent Concept.
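To get a feeling for what a number mask like #,###.00 produces, here is a rough Python analogue. It is only an approximation for illustration: PME applies its own formatting engine, the function name `format_like_mask` is made up, and edge cases (for example values below 1) may render differently in the real reports.

```python
def format_like_mask(value):
    """Approximate the mask #,###.00: thousands separators, two decimals."""
    return format(value, ",.2f")


print(format_like_mask(1234.5))       # 1,234.50
print(format_like_mask(1234567.891))  # 1,234,567.89
```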
Referencing the same table more than once
If a dimension table is referenced more than once by your fact table, just drop the dimension table several times into the business view and rename each copy. Then create a separate relationship for each of them.
How to create formulas
Take a look at the Pentaho Wiki for an introduction.
How to implement data security
- Click on your business model. Go to Tools > Security and import the roles from the Pentaho BI Server by entering the following URL (amend if necessary): http://localhost:8080/pentaho/ServiceAction
This will allow you to restrict the data for certain roles. If the connection works ok, you will see an XML extract of the roles definition.
- Go to your Business Model (this is one hierarchy below Business Models and has a brown briefcase symbol next to it) and right click, choose Edit. It is important that this is implemented on this level as otherwise it won't work.
- In the Metadata Security section add all the users/groups that you want to allow access. Assign the Update right to users that can save and edit their ad-hoc reports.
Also, if one user/group should have access to everything, you have to set the constraint to TRUE().