Sunday, March 11, 2012

PostgreSQL: Auto generating a sample dataset (II)

Some time ago I presented a simple solution to generate date dimension data in PostgreSQL using standard SQL. The below approach is another take on this and probably a little bit easier to read then the previous approach.
This time I made use of the GENERATE_SERIES function. As a simple example, we first use a Common Table Expression to generate a dataset which holds 10 successive dates and then in the main query we extract some date information:


CREATE TABLE
test.dwh.dim_date
(
date_tk BIGINT
, date DATE
, day SMALLINT
, month SMALLINT
, quarter SMALLINT
, year INT
, iso8601_year INT
, iso8601_week SMALLINT
, iso8601_dow SMALLINT
)
;


INSERT INTO
test.dwh.dim_date
WITH date_series AS (
SELECT
DATE(GENERATE_SERIES(DATE '2012-01-01', DATE '2014-01-10','1 day')) AS date
)
SELECT
CAST(TO_CHAR(date,'YYYYMMDD') AS INT) AS date_tk
, date 
, EXTRACT(DAY FROM date) AS day
, EXTRACT(MONTH FROM date) AS month
, EXTRACT(QUARTER FROM date) AS quarter
, EXTRACT(YEAR FROM date) AS year
-- the ISO 8601 year that the date falls in
-- Each ISO year begins with the Monday of the week containing the 4th of January, 
-- so in early January or late December the ISO year may be different from the Gregorian year.
, EXTRACT(ISOYEAR FROM date) AS iso8601_year
-- The number of the week of the year that the day is in. By definition (ISO 8601), weeks start on Mondays and the first week of a year contains January 4 of that year. In other words, the first Thursday of a year is in week 1 of that year.
-- In the ISO definition, it is possible for early-January dates to be part of the 52nd or 53rd week of the previous year, 
-- and for late-December dates to be part of the first week of the next year. 
-- For example, 2005-01-01 is part of the 53rd week of year 2004, and 2006-01-01 is part of the 52nd week of year 2005, 
-- while 2012-12-31 is part of the first week of 2013. It's recommended to use the isoyear field together with week to get consistent results.
, EXTRACT(WEEK FROM date) AS iso8601_week
-- The day of the week as Monday(1) to Sunday(7)
-- This is identical to psql dow except for Sunday (which is 0). This matches the ISO 8601 day of the week numbering.
, EXTRACT(ISODOW FROM date) AS iso8601_dow
FROM 
date_series
;


2 comments: