Aug 7, 2011

Data Pivoting - how to create a pivot query

1.  Pivoting data
Itzik Ben-Gan describes, "pivoting data is a technique that rotates data from a state of rows to a state of columns" in his article, 'Pivoting Data' (SQL Server Magazine, January 2011). Why do we ever have to rotate data then?  Data pivoting allows us to summarize(aggregate) a given data set and re-organize(pivot) the structure of the original data set for a better report purposes.

2.  Creating a pivot query   
Let's say we have a table, called 'SalesOrderHeader.'  And we need to summarize the data and horizontally show SUM of TotalDue of each YEAR for all the TerritoryIDs.

 SELECT     TerritoryID
                   , TotalDue
                  , YEAR(OrderDate) OrderYear
  FROM      Sales.SalesOrderHeader






















The report is supposed to summarize the data as following












3. A few things to identify before writing a PIVOT query.
- The data is to be grouped by TerritoryID (Grouping element)
TotalDue column is to be aggregated (Aggregation element)
- The distinct values of YEAR(OrderDate) are used as the column headings of the pivoted result (Pivoted element. To be displayed horizontally.). And we need to know the distinct values of YEAR(OrderDate) in advance in order to write our Pivot query.  i.e) 2005, 2006, 2006, 2007

4. Writing a pivot query in SQL Server
4.1) SQL Server with CASE expression.

 SELECT     TerritoryID,
SUM (CASE WHEN YEAR(OrderDate) = 2005  THEN TotalDue ELSE 0 END) as [2005],
SUM (CASE WHEN YEAR(OrderDate) = 2006  THEN TotalDue ELSE 0 END) as [2006],
SUM (CASE WHEN YEAR(OrderDate) = 2007  THEN TotalDue ELSE 0 END) as [2007],
SUM (CASE WHEN YEAR(OrderDate) = 2008  THEN TotalDue ELSE 0 END) as [2008]
       FROM     Sales.SalesOrderHeader
       GROUP BY    TerritoryID
       ORDER BY    TerritoryID;

4.2) SQL Server with PIVOT operator

WITH BaseTable AS
 (
    SELECT     TerritoryID
                      , TotalDue
                      , YEAR(OrderDate) OrderYear
     FROM      Sales.SalesOrderHeader
  )
  SELECT    TerritoryID
    , [2005], [2006], [2007], [2008]
FROM    BaseTable
PIVOT  ( SUM(TotalDue)  FOR  OrderYear  IN ([2005],[2006],[2007],[2008]))  as PvtResult ;

5. Things to remember
- When using PIVOT operator, creating a base table using CTE is preferred. Include the columns needed only: a grouping column, pivoted column and aggregated column. It is because any column that is not pivoted or aggregated will be used as a grouping element.
-  IN list restricts the rows that are pivoted and supplies the pivoted column names. So if you omit [2007] from the above query, summation on TotalDue for year 2007 won't be done even if there are some data for year 2007.
- IN list is NOT DYNAMIC. We can not use a sub-query to populate this IN list. It should be hard-coded.
  (If the IN list is unknown, we can create a dynamic SQL statement. Dynamic pivoting is beyond the scope of this post. )
---------------------------------------------------
Reference: 
-  Querying Microsoft SQL Server 2012 Training Kit by Itzik Ben-Gan, Dejan Sarka, Ron Talmage     http://technet.microsoft.com
-  http://sqlmag.com/t-sql/create-pivoted-tables-3-steps  "Create Pivoted Tables in 3 Steps" by Kathi Kellenberger

No comments:

Post a Comment