Having data in a normalized form is great. It helps eliminate redundant data, and therefore helps save space in the database. In this blog post, I want to discuss how you can query a normalized database to return denormalized data – that is, return data from many rows in a single column. I’ve used this technique a number of times, and below a couple of the ways I’ve put it to use.
- At my current company we have a custom service which runs on a nightly basis and sends emails for various reasons. Some emails are sent to the user in question, but also copies in their manager(s). If the user has multiple managers, we need to put both of their email addresses into a single “CC” column, separated by a semicolon, as well as the employees email address in the “To” column.
- Over the last 6 or so months, I’ve been building my company a Holiday & Sick Day management tool in ASP.Net & C# as a means to teach myself a coding language. In the tool, a manager can view sick days on a calendar by hovering over a highlighted day (indicating there was a sickness on that day), and a tool tip will appear with each person who was ill on that day. The tool tip came straight from a stored procedure in one column, making it much easier for me to utilize in C#.
Now that I’ve given you a small taster of what it’s used for, I should probably show you some SQL!
Using the second example given above, imagine we have the below query:
SELECT S.SicknessDate, P.Name FROM Person P JOIN Sickness S ON P.PersonId = S.PersonId
Which gives us this dataset:
Great, we have data! But, we only want 2 rows, one for ’2012-01-01′ containing Zach Stagers and Some Guy, and another for ’2012-03-01′ for Other Fella.
SELECT DISTINCT S.SicknessDate, ( SELECT Name + ', ' [text()] FROM Person P1 JOIN Sickness S1 ON P1.PersonId = S1.PersonId WHERE S1.SicknessDate = S.SicknessDate FOR XML PATH('') ) [Names] FROM Person P JOIN Sickness S ON P.PersonId = S.PersonId
And the new, easier to utilize, data set:
Perfecto! Just what we wanted! The FOR XML PATH subquery kindly does all of the work for us, and the distinct gets rid of duplicate rows caused when multiple people are ill on the same day. In the subquery, we use where the sickness date of the inner query equals the sickness date of the outer query, turn the result into XML (FOR XML PATH(”)), and tell it to just output it as plain text ([text()]).
I hope you find as useful an implementation for this as I have!