OBriens tower
Musings on software development, Linux and business

Archive for the 'databases' Category

What’s wrong with this java code?

Wednesday, April 16th, 2008

Can you spot the bug in this code?

01   Connection conn=null;
02   Statement st = null;
03   ResultSet rs = null;
04   try{
05    conn = getConnection();
06    st = conn.createStatement();
07    rs = conn.executeQuery("select foo from bar");
08    ...
09  }finally{
10    if(rs!=null) rs.close();
11    if(st!=null) st.close();
12    if(conn!=null) conn.close();
13  }

Answer: If an SQLException is thrown at line 10 or 11, line 12 will not be executed. If line 12 is not executed some resources may be lost.

Yes, the likelihood of an exception being thrown at line 10 or 11 is low, but good java programmers will avoid leaking resources by defensive use of try-catch blocks. Use the pattern of starting the try block immediately after allocating a resource. Here’s a better way to write the same block of code:

01   Connection conn =  getConnection();
02   try{
03     Statement st = conn.createStatement();
04     try{
05       ResultSet rs = conn.executeQuery("select foo from bar");
06        try{
07         ...
08        }finally{
10          rs.close();
11        }
12     }finally{
13       st.close();
14     }
15   }finally{
16     conn.close();
17   }

To Pivot or not to Pivot?

Wednesday, April 25th, 2007

The concept of Pivot style reports have been around for quiet a while, I have previously been exposed to them in excel, and in SQL 2000 it was possible to create them using the case statement, often referred to as “CROSS-TAB Reports”. In SQL Server 2005 two new operators Pivot and Unpivot have been added to provide this same functionality, which is to rotate data from a state of rows into columns, and vice versa. These new operators also have the added intention of simplifying the syntax that a complex series of case statements would previously have required.

Lets look at an example
Assume we have a table called ThreeWiseMen, where SELECT * FROM ThreeWiseMen would produce the following output

Subject Predicate Object
1 Name Bertie Ahern
2 Name George Bush
3 Name Tony Blair
1 Country Ireland
2 Country USA
3 Country Great Britain
1 TV Sopranos
2 TV The West Wing
1 Music Westlife
2 Music Van Morrison
2 Fav Team Dallas Cowboys
3 Fav Team Newcastle United

To Create and populate the ThreeWiseMen table:-

Create table dbo.ThreeWiseMen
(
 Subject varchar(50),
 Predicate varchar(50),
 Object varchar(50)
)
go

Insert into dbo.ThreeWiseMen (Subject, Predicate, Object)
 values (1, 'Name', 'Bertie Ahern')
Insert into dbo.ThreeWiseMen (Subject, Predicate, Object)
 values (2, 'Name', 'George Bush')
Insert into dbo.ThreeWiseMen (Subject, Predicate, Object)
 values (3, 'Name', 'Tony Blair')
Insert into dbo.ThreeWiseMen (Subject, Predicate, Object)
 values (1, 'Country', 'Ireland')
Insert into dbo.ThreeWiseMen (Subject, Predicate, Object)
 values (2, 'Country', 'USA')
Insert into dbo.ThreeWiseMen (Subject, Predicate, Object)
 values (3, 'Country', 'Great Britain')
Insert into dbo.ThreeWiseMen (Subject, Predicate, Object)
 values (1, 'TV', 'Sopranos')
Insert into dbo.ThreeWiseMen (Subject, Predicate, Object)
 values (2, 'TV', 'The West Wing')
Insert into dbo.ThreeWiseMen (Subject, Predicate, Object)
 values (1, 'Music', 'Westlife')
Insert into dbo.ThreeWiseMen (Subject, Predicate, Object)
 values (2, 'Music', 'Van Morrison')
Insert into dbo.ThreeWiseMen (Subject, Predicate, Object)
 values (2, 'Fav Team', 'Dallas Cowboys')
Insert into dbo.ThreeWiseMen (Subject, Predicate, Object)
 values (3, 'Fav Team', 'Newcastle United')
go

Now Lets Say We require output in the following format: -

Name Country TV Music Fav Team
Value Value Value Value Value

One way we could achive this is to use Case statements, though as you can begin to see in the following example, this could potentially cause alot of duplication the more unique predicates that are added over time.


SELECT
 max(case when Predicate = 'Name' then Object end)
  as [Name],
 max(case when Predicate = ‘Country’ then Object end)
  as [Country],
 max(case when Predicate = ‘TV’ then Object end)
  as [TV],
 max(case when Predicate = ‘Music’ then Object end)
  as [Music],
 max(case when Predicate = ‘Fav Team’ then Object end)
  as [Fav Team]
FROM DBO.ThreeWiseMen
GROUP BY Subject

A better approach would be the use of the Pivot Operator


SELECT
 max([Name]) as [Name],
 max([Country]) as [Country],
 max([TV]) as [TV],
 max([Music]) as [Music],
 max([Fav Team]) as [Fav Team]
FROM DBO.ThreeWiseMen
 PIVOT( max(Object) for Predicate in ([Name],[Country],
  [TV],[Music],[Fav Team])) AS piv
GROUP BY Subject

Pivot groups the records based on the fields that are not used in the pivot statement, which is Subject in this example, and thus the output will have a single record for each unique Subject value. Five columns will also be created from the Predicate values (Name, Country, TV, Music, Fav Team) and each row will now contain the Object value for that Predicate. Max is used to satisfy the requirement of an aggregiate expression in the pivot statement.

The resulting table now looks like the following:-

Name Country TV Music Fav Team
Bertie Ahern Ireland Sopranos Westlife NULL
George Bush USA The West Wing Van Morrison Dallas Cowboys
Tony Blair Great Britain NULL NULL Newcastle United

As you can see the Pivot statement is alot simplier to read and maintain, though it does have its limitations like the hard coded values within the ‘IN’ clause. One way around this is to use dynamic SQL to create a string to Pivot on the Predicate column values from the table as shown below.

T-Sql Pivot

DECLARE @Columns varchar(200)
DECLARE @ColNames varchar(200)
SET @Columns = ''
SET @ColNames = ''

SELECT @Columns = @Columns + '[' + Predicate + '],’,
@ColNames = @ColNames + ‘max([' + Predicate + ']) as [' + Predicate + '],’
FROM (SELECT Distinct Predicate from DBO.ThreeWiseMen) cols

– Remove the trailing comma
SET @Columns = LEFT(@Columns, LEN(@Columns) - 1)
SET @ColNames = LEFT(@ColNames, LEN(@ColNames) - 1)

EXEC( ‘SELECT ‘ + @ColNames + ‘
FROM DBO.ThreeWiseMen
 PIVOT ( MAX(Object) for Predicate in(’ + @Columns + ‘)) AS piv
GROUP BY Subject’ )

UNPIVOT
You are much less likely to use the Unpivot operator over the Pivot operator but its no harm to give it a mention here. It basically performs the opposite of Pivot by turning column values into rows as you can see in the resulting table.

SELECT Predicate, Object
FROM (SELECT
 max([Name]) as [Name],
 max([Country]) as [Country],
 max([TV]) as [TV],
 max([Music]) as [Music],
 max([Fav Team]) as [Fav Team]
FROM DBO.ThreeWiseMen
 PIVOT(max(Object) for Predicate in ([Name],[Country],
 [TV],[Music],[Fav Team])) piv
GROUP BY Subject) pvt
UNPIVOT
 (Object FOR Predicate IN
  ([Name],[Country],[TV],[Music],[Fav Team])) unpiv

Resulting Unpivot Table

Predicate Object
Name Bertie Ahern
Country Ireland
TV Sopranos
Music Westlife
Name George Bush
Country USA
TV The West Wing
Music Van Morrison
Fav Team Dallas Cowboys
Name Tony Blair
Country Great Britain
Fav Team Newcastle United

As the ThreeWiseMen example shows Pivot’s are perfect for situations where requirement change and new columns are constantly being added. These new Columns can now simply be added to the table dynamically as a new row and without affecting the database structure. In saying that it does have its limitations. You can only aggregrate one field, in CASE statements you can aggregrate as many as you like,it would also be cool if you could embed a select statement within the Pivot’s ‘IN’ clause and say something like PIVOT(max(Object) for Predicate in (SELECT Distinct Predicate from DBO.ThreeWiseMen)).

The Pivot operator is no doubt a powerfull function, and a welcome addition to SQL Server 2005, but the CASE statement still has it’s place in the SQL Toolkit, and my opinion it is still one of the most under-used SQL commands.