Sunday, February 12, 2012

PostgreSQL: Auto generating a sample dataset

Sometimes you just want to create simple sample datasets for tables quickly. If you are lucky your database provides some native SQL functions for this purpose.

PostgreSQL has quite an interesting recursive WITH statement which allows you to create a loop: You can define in example a value that is incremented with each iteration until a certain condition is met. This comes in quite handy when creating sample data for a date dimension in example:

Let's keep it very simple:


WITH RECURSIVE date_generator(date) AS (
   VALUES (DATE '2012-01-01')
 UNION ALL
   SELECT date+1 FROM date_generator WHERE date < DATE '2012-01-15'
)
SELECT
date
, EXTRACT(DAY FROM date) AS day
, EXTRACT(MONTH FROM date) AS month
, EXTRACT(QUARTER FROM date) AS quarter
, EXTRACT(YEAR FROM date) AS year
FROM
date_generator
;


In the WITH statement we provide a start date (in this case 2012-01-01) and increase it by 1 until a specific end date (in this case '2012-01-15') is reached. In the main query we make use of the auto generated dates by extracting various date periods. The output looks like this:

Now you can easily change to above query to insert the auto generated data into a table. This is a very elegant solution as everything can be set up using standard SQL. Have look the official PostgreSQL documentation for more information.

2 comments:

  1. Looking at this elegant code am I right in assuming that the first EXTRACT line in the example above should be;
    ", EXTRACT(DAY FROM date) AS day"

    ReplyDelete
  2. Well spotted! Thanks for pointing this out! I'll correct it immediately.

    ReplyDelete