1. Pivoting data
Itzik Ben-Gan describes, "pivoting data is a technique that rotates data from a state of rows to a state of columns" in his article, 'Pivoting Data' (SQL Server Magazine, January 2011). Why do we ever have to rotate data then? Data pivoting allows us to summarize(aggregate) a given data set and re-organize(pivot) the structure of the original data set for a better report purposes.
2. Creating a pivot query
Let's say we have a table, called 'SalesOrderHeader.' And we need to summarize the data and horizontally show SUM of TotalDue of each YEAR for all the TerritoryIDs.
SELECT TerritoryID
, TotalDue
, YEAR(OrderDate) OrderYear
FROM Sales.SalesOrderHeader
The report is supposed to summarize the data as following
3. A few things to identify before writing a PIVOT query.
- The data is to be grouped by TerritoryID (Grouping element)
- TotalDue column is to be aggregated (Aggregation element)
- The distinct values of YEAR(OrderDate) are used as the column headings of the pivoted result (Pivoted element. To be displayed horizontally.). And we need to know the distinct values of YEAR(OrderDate) in advance in order to write our Pivot query. i.e) 2005, 2006, 2006, 2007
4. Writing a pivot query in SQL Server
4.1) SQL Server with CASE expression.
SELECT TerritoryID,
SUM (CASE WHEN YEAR(OrderDate) = 2005 THEN TotalDue ELSE 0 END) as [2005],
SUM (CASE WHEN YEAR(OrderDate) = 2006 THEN TotalDue ELSE 0 END) as [2006],
SUM (CASE WHEN YEAR(OrderDate) = 2007 THEN TotalDue ELSE 0 END) as [2007],
SUM (CASE WHEN YEAR(OrderDate) = 2008 THEN TotalDue ELSE 0 END) as [2008]
FROM Sales.SalesOrderHeader
GROUP BY TerritoryID
ORDER BY TerritoryID;
4.2) SQL Server with PIVOT operator
WITH BaseTable AS
(
SELECT TerritoryID
, TotalDue
, YEAR(OrderDate) OrderYear
FROM Sales.SalesOrderHeader
)
SELECT TerritoryID
, [2005], [2006], [2007], [2008]
FROM BaseTable
PIVOT ( SUM(TotalDue) FOR OrderYear IN ([2005],[2006],[2007],[2008])) as PvtResult ;
5. Things to remember
- When using PIVOT operator, creating a base table using CTE is preferred. Include the columns needed only: a grouping column, pivoted column and aggregated column. It is because any column that is not pivoted or aggregated will be used as a grouping element.
- IN list restricts the rows that are pivoted and supplies the pivoted column names. So if you omit [2007] from the above query, summation on TotalDue for year 2007 won't be done even if there are some data for year 2007.
- IN list is NOT DYNAMIC. We can not use a sub-query to populate this IN list. It should be hard-coded.
(If the IN list is unknown, we can create a dynamic SQL statement. Dynamic pivoting is beyond the scope of this post. )
---------------------------------------------------
Reference:
- Querying Microsoft SQL Server 2012 Training Kit by Itzik Ben-Gan, Dejan Sarka, Ron Talmage http://technet.microsoft.com
- http://sqlmag.com/t-sql/create-pivoted-tables-3-steps "Create Pivoted Tables in 3 Steps" by Kathi Kellenberger
No comments:
Post a Comment