Returning vertical data in one horizontal column

Having data in a normalized form is great. It helps eliminate redundant data, and therefore helps save space in the database. In this blog post, I want to discuss how you can query a normalized database to return denormalized data – that is, return data from many rows in a single column. I’ve used this technique a number of times, and below a couple of the ways I’ve put it to use.

  • At my current company we have a custom service which runs on a nightly basis and sends emails for various reasons. Some emails are sent to the user in question, but also copies in their manager(s). If the user has multiple managers, we need to put both of their email addresses into a single “CC” column, separated by a semicolon, as well as the employees email address in the “To” column.
  • Over the last 6 or so months, I’ve been building my company a Holiday & Sick Day management tool in ASP.Net & C# as a means to teach myself a coding language. In the tool, a manager can view sick days on a calendar by hovering over a highlighted day (indicating there was a sickness on that day), and a tool tip will appear with each person who was ill on that day. The tool tip came straight from a stored procedure in one column, making it much easier for me to utilize in C#.

Now that I’ve given you a small taster of what it’s used for, I should probably show you some SQL!

Using the second example given above, imagine we have the below query:

SELECT S.SicknessDate,
	   P.Name   
 
FROM Person P
	 JOIN Sickness S ON P.PersonId = S.PersonId

Which gives us this dataset:

Returning vertical data in one horizontal column

 

 

 

Great, we have data! But, we only want 2 rows, one for ’2012-01-01′ containing Zach Stagers and Some Guy, and another for ’2012-03-01′ for Other Fella.

How do we do that? Through the use of a correlated subquery making use of FOR XML PATH and the [text()] column name. Here’s the new code:

SELECT DISTINCT 
	   S.SicknessDate,
	   (
		SELECT Name + ', ' [text()]
 
		FROM Person P1
			 JOIN Sickness S1 ON P1.PersonId = S1.PersonId
 
		WHERE S1.SicknessDate = S.SicknessDate
 
		FOR XML PATH('')
	   ) [Names]
 
FROM Person P
	 JOIN Sickness S ON P.PersonId = S.PersonId

And the new, easier to utilize, data set:

Returning vertical data in one horizontal column_FIXED

 

 

 
Perfecto! Just what we wanted! The FOR XML PATH subquery kindly does all of the work for us, and the distinct gets rid of duplicate rows caused when multiple people are ill on the same day. In the subquery, we use where the sickness date of the inner query equals the sickness date of the outer query, turn the result into XML (FOR XML PATH(”)), and tell it to just output it as plain text ([text()]).

I hope you find as useful an implementation for this as I have!

Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)

Stored Procedure Definition Search

I just re-discovered this little script in my “Magic SQL” folder, and thought I’d share it.

Below is a script to search the definition of all stored procedures in a database for a specified value – useful in circumstances when you’re trying to figure out if a particular column is used anywhere in procedure code, or something like that.

USE DatabaseName
 
DECLARE @SearchString VARCHAR(500)
SET @SearchString = 'Search'
 
SELECT DISTINCT
       Name
 
FROM sysobjects SO
     JOIN syscomments SC ON SO.id = SC.id
 
WHERE SO.[type] = 'P'
      AND SO.category = 0
      AND SC.[text] LIKE '%' + @SearchString + '%'

To use the script, change the USE statement to point at the database you’d like to search, and change the value of @SearchString to the value you’d like to search for.

The result set will provide the name of any procedures containing the value defined within @SearchString.

Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)

Returning Tabular Data as XML with FOR XML PATH and Nested XML Queries

FOR XML PATH is the favorite amongst most Database Developers due to its ease of use, and the fact that you can build almost any XML structure with it.

I’ll also be discusing nested XML Queries in this post, which kind of goes hand in hand with FOR XML PATH. More on this later.

I’ll start off with an example, then explain what’s going on afterwards:

SELECT P.Name AS "@Name",
       P.Age AS "@Age",
       'PersonId = ' + CAST(P.PersonId AS VARCHAR(5)) AS "comment()",
       (
        SELECT O.OrderId AS "@OrderId",
               O.Item AS "@Item"
	FROM Orders O
	WHERE O.PersonId = P.PersonId
	FOR XML PATH('Order'), TYPE
       ) 
 
FROM Person P
 
WHERE P.PersonId IN (1, 2)
 
FOR XML PATH('Person'), ROOT('People'), ELEMENTS XSINIL

As with each of the other XML Modes (Raw, Auto, and Explicit), you’re able to change the name of your elements from the default. To achieve this use FOR XML PATH(‘Person’), this will give your rows the name “Person”.

ROOT(‘People’) specifies the root element to be wrapped around you’re XML. The keyword ELEMENTS in this instance doesn’t do a whole lot as the structure is placed into elements and attributes depending on the names given to the columns, but XSINIL cannot be specified without it. XSINIL is used to give you’re XML a namespace, as well as to handle NULL values.

The actual output of the query used aboved is as below:

<People xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Person Name="Zach Stagers" Age="22">
    <!--PersonId = 1-->
    <Order OrderId="1" Item="Bike" />
    <Order OrderId="2" Item="Laptop" />
    <Order OrderId="3" Item="Phone" />
    <Order OrderId="4" Item="Keyboard" />
    <Order OrderId="5" Item="Mouse" />
    <Order OrderId="10" Item="Pizza" />
  </Person>
  <Person Name="Libbie Coulson" Age="24">
    <!--PersonId = 2-->
  </Person>
</People>

The first two columns in the select are specified as “@Name” and “@Age”, this assigns them as attributes to the Person Element.

Specifying the column name as “comment()” makes the value a comment within the XML, note that this is case sensitive – “comment()” is not the same as “Comment()”.

One column name that I haven’t used in the example is “text()”, this sets the value as the text of the element, instead of an attribute.. for example, using this against P.Name turns the value into:

<Person Age="22">Zach Stagers</Person>

Nested XML Queries

Nested XML Queries are what gives the FOR XML PATH mode its structure. To do this, as in the example above, you would use a correlated subquery. Here, you link the subquery to the outer query on the PersonId – linking the person to their orders, and therefore listing the orders beneath the correct person.

The subqueries FOR XML statement must have TYPE specified. This translates the queries results into true XML, Microsoft did this for backwards compatibility reasons.

Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)

Returning Tabular Data as XML with FOR XML EXPLICIT

FOR XML EXPLICIT can be a little awkward to use to say the least, but it has it uses. The good thing about EXPLICIT, is that you can mold it into almost any XML structure you like.

I actually found that reviewing the results as tabular data is what really helped me understand what the FOR XML EXPLICIT query is doing.

Below is the example code that I’m using:

SELECT
	1 AS Tag,
	NULL AS Parent,
	Name AS "Person!1!Name",
	PersonId AS "Person!1!PersonId",
	NULL AS "Orders!2!Item"
FROM Person
 
UNION ALL
 
SELECT
	2 AS Tag,
	1 AS Parent,
	NULL AS "Person!1!Name",
	PersonId AS "Person!1!PersonId",
	Item AS "Orders!2!Item"
 
FROM Orders
 
ORDER BY "Person!1!PersonId"
 
FOR XML EXPLICIT, ROOT('People')

The first thing to note with this query, is the two columns “tag” and “parent”. You must specify these columns in order to use FOR XML EXPLICIT. These columns specify the hierarchy of your results. The first query will have a “tag” of 1, and a NULL “parent”, as it is the first object declared. Each subsequent “tag” and “parent” will then go up by increments of 1.

Another thing to note, is the column names. The value listed before the first exclamation in the column name specifies the Element Name. The value listed before the second exclamation is the “tag” that it relates to. Then the attribute name is specified. ElementName!TagNumber!AttributeName.

Lastly, it’s the ORDER BY “Person!1!PersonId” that really gives the final result its correct structure.

The XML outputted by the query specified above is as follows:

<People>
  <Person Name="Zach Stagers" PersonId="1">
    <Orders Item="Bike" />
    <Orders Item="Laptop" />
    <Orders Item="Phone" />
    <Orders Item="Keyboard" />
    <Orders Item="Mouse" />
    <Orders Item="Pizza" />
  </Person>
  <Person Name="Libbie Coulson" PersonId="2" />
</People>

As with RAW and AUTO, the ROOT() simply specifies the name of the root element. This is optional.

Without the aforementioned ORDER BY clause, all of the orders for person “Zach” would appear under “Libbie” as she is last in the list, and the UNION ALL will just tack the results from the second query to the end of the result. Ordering by the key between the two tables ensures that the orders against PersonId 1 appear under the person with PersonId 1.

As I mentioned earlier in this post, what really helped me understand what was happening with FOR XML EXPLICIT was seeing the results in tabular form. To do this, simply remove the FOR XML line and re-run the code. In the below tabular example, I’ve removed FOR XML EXPLICIT, ROOT(‘People’) from the query:

FOR XML EXPLICIT Tabular

Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)

SQL If Else Clause

Frequently used in Stored Procedures, the SQL If can be used to help control the flow and execution of your SQL.

The If checks to see if a boolean condition is true, and if it is, the code directly beneath is executed. This is handy, as we often need to execute completely different code depending on the value of a variable.

The syntax is easy to use, but can make a big difference to your code:

SELECT @Value = COUNT(*) FROM SpecialOffers WHERE Valid = 1
 
IF (@Value > 0)
  BEGIN
 
    SELECT * FROM SpecialOffersProducts
 
  END
ELSE
  BEGIN
 
    SELECT * FROM Products
 
  END

The above code is counting the number of valid special offers and assigning the number to the variable @Value.

The If then checks that @Value is greater than zero, meaning there are valid special offers available. If the check is true, and there are special offers available, our SpecialOffersProducts will be returned. If the count returns as zero, the else will be executed, and the Products will be returned.

The above example would typically be used within a stored procedure.

Multiple conditions can be specified within the If with the use of the AND and OR keywords.

Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)

Changing Data Types – Cast and Convert

The CAST and CONVERT System Functions essentially achieve the same task, with a few small difference.

CAST is the ANSI compliant version of the two. CAST can change data from one type to another, but cannot change the format of a date. It’s simple to use, and it’s syntax is as follows:

CAST(Expression AS [Target Data Type])

Where I’ve written “Expression”, you would replace with either a column name, or a hardcoded value. “Target Data Type” would be replaced with the name of the Data Type that you’d like to convert the value to.

CONVERT works in a similar manner, but has one more optional value available and is not ANSI standard.

CONVERT([Target Data Type], Expression, Style)

As above, Target Data Type is the data type that you’d like to convert to, and Expression is the column or hardcoded value. You’ll notice that this example also includes Style. This option is available when working with dates. For example, most of the time dates are stored as a DateTime which would look like “2001-01-01 00:00:00:000″. If you wanted to exclude the time and display the date as “01 Jan 2001″ you’d use:

CONVERT (VARCHAR(11), '2001-01-01 00:00:00:000', 106)

Below is a table of some of the more common Date Styles used, and an example of how the output will look. All of the examples use Jan 31 2012.

Style Number Date Output Example
101 01-31-2012 (MM/DD/YYYY)
103 31-01-2012 (DD/MM/YYYY)
106 31 Jan 2012 (DD MON YYYY)
101 Jan 31, 2012 (MON DD, YYYY)

Not all data types can be converted or cast to all data types, for example a Bit cannot be converted to an Image.

Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)

SQL Joins – Left, Right, Inner, Full, and Cross!

SQL Joins are incredibly useful, and knowing the difference between the five different types makes harnessing their great power a lot easier… obviously!

Joins are used to join tables together in a query, so that you can use data from both tables. You are able to join to a maximum of 256 tables, which frankly, should be more than enough!

The syntax goes like this:

SELECT *
FROM Person AS P INNER JOIN
     Orders AS O ON P.PersonId = O.PersonId

The above example uses the most common of all joins, the INNER JOIN. The word “Inner” can actually be omitted from the query, as it isn’t necessary.

The Inner Join will join the two tables together on the linking columns, and only the results where a match is found in both tables will be returned.

Next, we have the OUTER JOIN’s (Left, Right, and Full). These are just as simple to use as the INNER join, but work a little differently. The Synax is the same as the above INNER join example, just replace “INNER JOIN” with “LEFT OUTER JOIN” or Right / Full, depending on what you want to achieve.

Again, with the OUTER Join’s, the word “Outer” can be omitted.

The way the LEFT JOIN works is by selecting all of the data from the table on the left, and returns all of the matchs from the table on the right. If there isn’t a match, then a NULL will be returned by the right table. Right Join is exactly the same, just reversed (returns everything on the right, and all possible matches on the left).

FULL JOIN will return every row from the left and right tables, matching where possible.

And lastly we have CROSS JOIN. This join will match every row from the left table, to every row in the right table. This means that if the left table has 100 rows, and the right table also has 100 rows, our result set would be a grand total of 10,000 rows! Can I think of an example where you would use this? No.

Hopefully the below simple diagram will help re-enforce what I’ve said above.

Joins Diagram

Joins Diagram

Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)

Returning Tabular Data as XML with FOR XML AUTO

About a week ago I posted on FOR XML RAW, this time it’s FOR XML AUTO‘s turn.

The two are actioned via very similar syntax, but operate slightly differently. Auto is probably named “Auto” because it performs a lot of the XML node naming for you. There are also a couple of other differences which I’ll explain in this post.

With FOR XML AUTO, you don’t have the ability to name your row’s like you do in RAW. This is because their names are derived from the aliases given to the tables in your query. For example, having a table named Person, and using FOR XML AUTO against this table would give your rows a name of “Person”. This is the same as using FOR XML RAW(‘Person’), see below for a literal example of this:

SELECT * FROM Person
FOR XML AUTO
SELECT * FROM Person
FOR XML RAW('Person')

Both of the above examples would return the below XML:

<Person PersonId="1" Name="Zach Stagers" Age="22" />

As with RAW, you can specify your root element in AUTO:

SELECT * FROM Person
FOR XML AUTO, ROOT('People')

The above would return the exact same XML data, but with the specified root element wrapped around it.

<People>
 <Person PersonId="1" Name="Zach Stagers" Age="22" />
</People>

The attributes of the person row can further be split out into their own elements by specifying ELEMENTS, and can also be given a namespace and the ability to handle SQL NULL‘s by specifying XSINIL

SELECT * FROM Person
FOR XML AUTO, ROOT('People'), ELEMENTS, XSINIL

The above example would return:

<People xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <Person>
  <PersonId>1</PersonId>
  <Name>Zach Stagers</Name>
  <Age>22</Age>
 </Person>
</People>

There are a couple of other things to remember when using FOR XML AUTO, and that is that the order of your XML elements is down the order in which your columns are specified. For example, executing the below two queries would give you the same information, but using them with FOR XML AUTO will give you completely different XML:

SELECT Person.Name,
       Orders.Item
 
FROM Person 
     JOIN Orders ON Orders.PersonId = Person.PersonId
 
FOR XML AUTO, ROOT('PeopleWithOrders')

Returns:

<PeopleWithOrders>
 <Person Name="Zach Stagers">
  <Orders Item="Bike" />
  <Orders Item="Laptop" />
  <Orders Item="Phone" />
 </Person>
</PeopleWithOrders>
SELECT Orders.Item,
       Person.Name
 
FROM Person 
     JOIN Orders ON Orders.PersonId = Person.PersonId
 
FOR XML AUTO, ROOT('PeopleWithOrders')

The above query, with the columns swapped over will return:

<PeopleWithOrders>
 <Orders Item="Bike">
  <Person Name="Zach Stagers" />
 </Orders>
 <Orders Item="Laptop">
  <Person Name="Zach Stagers" />
 </Orders>
 <Orders Item="Phone">
  <Person Name="Zach Stagers" />
 </Orders>
</PeopleWithOrders>

So, you just need to remember that the order of the columns is what specifies the output of your XML, not the order of the tables in the FROM clause. The names given to the tables, however, specifies the names given to the XML nodes.

Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)

Recursive CTEs

Recursive Common Table Expressions allow you to perform a recursive query, without the use of a resource intensive Cursor.

In this post, I’ll show you how to write a very simple Recursive CTE, and explain how it works.

The example that I’ll be working with in this post is as below:

;WITH RCTE AS
       (
        SELECT 1 AS [Recursion Level], 'A' [Type]
 
        UNION ALL
 
        SELECT [Recursion Level] +1 AS [Recursion Level], 'R' [Type]
        FROM RCTE
        WHERE [Recursion Level] < 5
       )
 
SELECT * FROM RCTE
OPTION(MAXRECURSION 4)

This example obviously has no practical use, but it’s good for showing how a Recursive CTE works.

The query starts in the exact same way as a normal CTE, ;WITH RCTE AS. The semicolon terminates any SQL defined before the CTE, and it’s best practice to add a semicolon here for that reason. If any preceding code (Commonly the declaration of variables) isn’t terminated by a semicolon, an error will occur.

The second line, SELECT 1 AS [Recursion Level], ‘A’ [Type] is known as the “Anchor Member”. This query defines the base level item that you would like to build upon. This simple select just creates a single record with two columns.

UNION ALL joins the two parts of the recursive query together, and must be defined after your anchor query.

The query defined after the UNION ALL is known as the “Recursive Member”. This query must reference the CTE in its FROM clause. This is so the query knows what base values it has, and can therefore build upon. The where clause, WHERE [Recursion Level] < 5, is a terminator statement, which stops the query from looping indefinitely. The CTE is then closed off with a closing a bracket.

A simple SELECT * FROM RCTE is all that is needed to pull back all of the required information from the CTE in this example.

OPTION(MAXRECURSION 4) is essentially another safety net to stop the query from spiralling into an infinite loop. This stops the query from running at a maximum of 4 recursions, but be aware that if 4 recursions isn't enough to extract all of the data within the loop - an error will occur. For example, If I changed my WHERE clause to less than 6 in the Recursive Member, and left my Max Recursion option at 4 I would recieve the following error:

The statement terminated. The maximum recursion 4 has been exhausted before statement completion.

The Max Recursion clause is optional.

The result set from the example query would like the below:

Recursive CTE Result

Recursive CTE Result

As you can see, 5 results were returned – even though WHERE [Recursion Level] < 5 was specified in the Recursive Member, and OPTION(MAXRECURSION 4) is defined in the outer query. This is because the Anchor Member (First record in the result set, type “A”) doesn’t count as a recursion. There are, however, 4 recursion records (type “R”).

A more practical use of a Recursive CTE would be to build something like a family tree, or a companies employee structure.

Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)

Case Statements

Case Statements are used to output different data depending on the value returned from a column:

CASE
 WHEN Column1 = 'Seven' THEN 7
 WHEN Column1 = 'Eight' THEN 8
 ELSE 0
 END AS [Number]

The above syntax would be used within a select statement, and would be placed along side all other columns in your column list. This example would check all data returned from Column1, if it finds the word ‘Seven’ then the number 7 will be returned in its place, the same goes for ‘Eight’ and 8. Anything that does not equal ‘Seven’ or ‘Eight’ will be returned as a 0.

As many WHEN lines can be written as are necessary.

It’s possible to nest Case statements up to a maximum of 10 levels deep.

Case Statements can also be used within Where clauses, which is particulary useful within a parameterized stored procedure.

For example, if you had a parameter of @Cost, then your Where clause may look like this:

WHERE CASE
       WHEN @Cost = 'All' THEN 1
       WHEN @Cost = 'None' AND Column1 IS NULL THEN 1
       WHEN @Cost NOT IN ('All', 'None') AND Column1 = CONVERT(INT, @Cost) THEN 1
       ELSE 0
       END = 1

The above example is checking the value of our Varchar parameter @Cost, then performing any search functionality that goes along with it.

When @Cost is All, there is no other search critiria written as we want to return everything. When @Cost is None, then we only want to return columns with a NULL entry, so we check that @Cost is ‘None’ and our Column1 IS NULL.

When we want to work with the @Cost parameter as an INT, we have to explicitly define that it’s not All or None, then specify our search “Column1 = Covert(INT, @Cost)”. If we do not explicitly define that we are no longer searching for All or None within @Cost, an error will occur for trying to compare an Int and a Varchar value.

The last use of the CASE statement that I’d like to mention is within an UPDATE statement. You can use the CASE to define what value should be set depending on the value of another column, for example:

UPDATE Table1
SET Column1 = CASE 
               WHEN Column2 = 'Seven' THEN 7
               WHEN Column2 = 'Eight' THEN 8
               END
WHERE Column1 IS NULL
Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)