Returning Tabular Data as XML with FOR XML AUTO

About a week ago I posted on FOR XML RAW, this time it’s FOR XML AUTO‘s turn.

The two are actioned via very similar syntax, but operate slightly differently. Auto is probably named “Auto” because it performs a lot of the XML node naming for you. There are also a couple of other differences which I’ll explain in this post.

With FOR XML AUTO, you don’t have the ability to name your row’s like you do in RAW. This is because their names are derived from the aliases given to the tables in your query. For example, having a table named Person, and using FOR XML AUTO against this table would give your rows a name of “Person”. This is the same as using FOR XML RAW(‘Person’), see below for a literal example of this:

SELECT * FROM Person
FOR XML AUTO
SELECT * FROM Person
FOR XML RAW('Person')

Both of the above examples would return the below XML:

<Person PersonId="1" Name="Zach Stagers" Age="22" />

As with RAW, you can specify your root element in AUTO:

SELECT * FROM Person
FOR XML AUTO, ROOT('People')

The above would return the exact same XML data, but with the specified root element wrapped around it.

<People>
 <Person PersonId="1" Name="Zach Stagers" Age="22" />
</People>

The attributes of the person row can further be split out into their own elements by specifying ELEMENTS, and can also be given a namespace and the ability to handle SQL NULL‘s by specifying XSINIL

SELECT * FROM Person
FOR XML AUTO, ROOT('People'), ELEMENTS, XSINIL

The above example would return:

<People xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <Person>
  <PersonId>1</PersonId>
  <Name>Zach Stagers</Name>
  <Age>22</Age>
 </Person>
</People>

There are a couple of other things to remember when using FOR XML AUTO, and that is that the order of your XML elements is down the order in which your columns are specified. For example, executing the below two queries would give you the same information, but using them with FOR XML AUTO will give you completely different XML:

SELECT Person.Name,
       Orders.Item
 
FROM Person 
     JOIN Orders ON Orders.PersonId = Person.PersonId
 
FOR XML AUTO, ROOT('PeopleWithOrders')

Returns:

<PeopleWithOrders>
 <Person Name="Zach Stagers">
  <Orders Item="Bike" />
  <Orders Item="Laptop" />
  <Orders Item="Phone" />
 </Person>
</PeopleWithOrders>
SELECT Orders.Item,
       Person.Name
 
FROM Person 
     JOIN Orders ON Orders.PersonId = Person.PersonId
 
FOR XML AUTO, ROOT('PeopleWithOrders')

The above query, with the columns swapped over will return:

<PeopleWithOrders>
 <Orders Item="Bike">
  <Person Name="Zach Stagers" />
 </Orders>
 <Orders Item="Laptop">
  <Person Name="Zach Stagers" />
 </Orders>
 <Orders Item="Phone">
  <Person Name="Zach Stagers" />
 </Orders>
</PeopleWithOrders>

So, you just need to remember that the order of the columns is what specifies the output of your XML, not the order of the tables in the FROM clause. The names given to the tables, however, specifies the names given to the XML nodes.

Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)

Recursive CTEs

Recursive Common Table Expressions allow you to perform a recursive query, without the use of a resource intensive Cursor.

In this post, I’ll show you how to write a very simple Recursive CTE, and explain how it works.

The example that I’ll be working with in this post is as below:

;WITH RCTE AS
       (
        SELECT 1 AS [Recursion Level], 'A' [Type]
 
        UNION ALL
 
        SELECT [Recursion Level] +1 AS [Recursion Level], 'R' [Type]
        FROM RCTE
        WHERE [Recursion Level] < 5
       )
 
SELECT * FROM RCTE
OPTION(MAXRECURSION 4)

This example obviously has no practical use, but it’s good for showing how a Recursive CTE works.

The query starts in the exact same way as a normal CTE, ;WITH RCTE AS. The semicolon terminates any SQL defined before the CTE, and it’s best practice to add a semicolon here for that reason. If any preceding code (Commonly the declaration of variables) isn’t terminated by a semicolon, an error will occur.

The second line, SELECT 1 AS [Recursion Level], ‘A’ [Type] is known as the “Anchor Member”. This query defines the base level item that you would like to build upon. This simple select just creates a single record with two columns.

UNION ALL joins the two parts of the recursive query together, and must be defined after your anchor query.

The query defined after the UNION ALL is known as the “Recursive Member”. This query must reference the CTE in its FROM clause. This is so the query knows what base values it has, and can therefore build upon. The where clause, WHERE [Recursion Level] < 5, is a terminator statement, which stops the query from looping indefinitely. The CTE is then closed off with a closing a bracket.

A simple SELECT * FROM RCTE is all that is needed to pull back all of the required information from the CTE in this example.

OPTION(MAXRECURSION 4) is essentially another safety net to stop the query from spiralling into an infinite loop. This stops the query from running at a maximum of 4 recursions, but be aware that if 4 recursions isn't enough to extract all of the data within the loop - an error will occur. For example, If I changed my WHERE clause to less than 6 in the Recursive Member, and left my Max Recursion option at 4 I would recieve the following error:

The statement terminated. The maximum recursion 4 has been exhausted before statement completion.

The Max Recursion clause is optional.

The result set from the example query would like the below:

Recursive CTE Result

Recursive CTE Result

As you can see, 5 results were returned – even though WHERE [Recursion Level] < 5 was specified in the Recursive Member, and OPTION(MAXRECURSION 4) is defined in the outer query. This is because the Anchor Member (First record in the result set, type “A”) doesn’t count as a recursion. There are, however, 4 recursion records (type “R”).

A more practical use of a Recursive CTE would be to build something like a family tree, or a companies employee structure.

Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)

Managing Primary Key Constraints

Primary Key Constraints keep your rows unique, and therefore help maintain your datas integrity.

They can be applied to a single column, or across multiple columns (A maximum of 16). Any column or combination of columns referenced by the Primary Key will be unique, and because of this, they’re usually chosen as the Identity column as well.

A Primary Key column cannot be nullable, due to the fact that it forces uniqueness in the table. Computed columns are allowed to be used as a Primary Key.

When a Primary Key is created, the columns it’s created on also have a clustered index created on them. If you’d like to override this functionality, you can do so by specifying NONCLUSTERED during the creation of the key.

Adding a Primary Key Constraint to an existing table

ALTER TABLE Products
ADD PRIMARY KEY (ProductId)

The above code will modify the Products table to add a Primary Key Constraint to the ProductId column.

ALTER TABLE Products
ADD PRIMARY KEY NONCLUSTERED (ProductId, ProductName)

The above code will modify the Products table to add a Primary Key Constraint across two columns, but also overrides the default action of creating a clustered composite index, and creates a nonclustered composite index instead. What do I mean by composite? This means that the index references multiple columns.

Adding a Primary Key Constraint during table creation

CREATE TABLE Products 
  (
   ProductId INT PRIMARY KEY IDENTITY,
   ProductName VARCHAR(50))

In the above example, we’re adding the Primary Key to the integer column ProductId. The same column is also being specified as the identity column for the table, as I mentioned earlier, this is fairly common.

Removing a Primary Key Constraint

A table can only ever have one Primary Key, and you may reassess your needs and have to drop the existing Primary Key.

In order to do this, you’ll first need to find the name of the primary key that you’d like to drop. You can do this by executing the below code (replace “Schema.Table” with your details, but keep the single quotes):

SELECT I.name [PrimaryKeyName]
 
FROM sys.tables T 
     JOIN sys.indexes I ON I.OBJECT_ID = T.OBJECT_ID
 
WHERE T.OBJECT_ID = OBJECT_ID('Schema.Table') 
      AND I.is_primary_key = 1

Once you’ve run the above code, and the name has been returned, you can run the below code to drop the Primary Key:

ALTER TABLE table_name
DROP CONSTRAINT primary_key_constraint_name
Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)

Case Statements

Case Statements are used to output different data depending on the value returned from a column:

CASE
 WHEN Column1 = 'Seven' THEN 7
 WHEN Column1 = 'Eight' THEN 8
 ELSE 0
 END AS [Number]

The above syntax would be used within a select statement, and would be placed along side all other columns in your column list. This example would check all data returned from Column1, if it finds the word ‘Seven’ then the number 7 will be returned in its place, the same goes for ‘Eight’ and 8. Anything that does not equal ‘Seven’ or ‘Eight’ will be returned as a 0.

As many WHEN lines can be written as are necessary.

It’s possible to nest Case statements up to a maximum of 10 levels deep.

Case Statements can also be used within Where clauses, which is particulary useful within a parameterized stored procedure.

For example, if you had a parameter of @Cost, then your Where clause may look like this:

WHERE CASE
       WHEN @Cost = 'All' THEN 1
       WHEN @Cost = 'None' AND Column1 IS NULL THEN 1
       WHEN @Cost NOT IN ('All', 'None') AND Column1 = CONVERT(INT, @Cost) THEN 1
       ELSE 0
       END = 1

The above example is checking the value of our Varchar parameter @Cost, then performing any search functionality that goes along with it.

When @Cost is All, there is no other search critiria written as we want to return everything. When @Cost is None, then we only want to return columns with a NULL entry, so we check that @Cost is ‘None’ and our Column1 IS NULL.

When we want to work with the @Cost parameter as an INT, we have to explicitly define that it’s not All or None, then specify our search “Column1 = Covert(INT, @Cost)”. If we do not explicitly define that we are no longer searching for All or None within @Cost, an error will occur for trying to compare an Int and a Varchar value.

The last use of the CASE statement that I’d like to mention is within an UPDATE statement. You can use the CASE to define what value should be set depending on the value of another column, for example:

UPDATE Table1
SET Column1 = CASE 
               WHEN Column2 = 'Seven' THEN 7
               WHEN Column2 = 'Eight' THEN 8
               END
WHERE Column1 IS NULL
Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)

Output Into – Audit Trails

Output & Output Into can be used to track the details of data that’s inserted, updated, or deleted.

Ouput is a “simplified” version of Output Into. Output will simply print the results to screen, where as Output Into will insert the results into a specified table.

These keywords have access to two sytem tables, inserted and deleted. The inserted table contains all data that is added, anything inserted into a table with an insert statement, or the new data added within an update statement. The deleted table holds the opposite, anything removed with a delete statement, or anything overwritten with an update statement.

The below examples show how to use the Output clause with a Delete, Insert, and Update Statement.

DELETE FROM Person
OUTPUT deleted.PersonId, deleted.Name INTO PersonAudit
WHERE PersonId = 73
INSERT INTO Person
OUTPUT inserted.PersonId inserted.Name INTO PersonAudit
VALUES('Zach Stagers')
UPDATE Person
SET Name = 'Zach Stagers'
OUTPUT inserted.PersonId, inserted.Name, deleted.Name INTO PersonAudit
WHERE PersonId = 88

With each of the above examples, remove INTO PersonAudit to use the simplified version mentioned earlier, and just print the results to screen.

With a little bit of planning and correct use of the Output Into clause, you have the ability to create a very useful audit trail.

Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)

Moving Database Objects between Schemas the quick way

I’ve been doing a bit of database re-structuring lately, and I came across a nice and quick way to move database objects between schemas. Instead of dropping and recreating the object in the schema you require it to be in – simply execute the below:

ALTER SCHEMA MySchema TRANSFER dbo.Sproc;

The above example will move the object “dbo.Sproc” into the “MySchema” schema.

Short and sweet.

Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)

SQL Coalesce

I’ve been using this sytem function a lot lately, and it’s really useful!

I’m having to check for a phone number in several different places, and using Coalesce helped me avoid using a long drawn out Case Statement.

COALESCE(expression 1, expression 2, expression 3) [Phone Number]

The above code will be placed in your column select list. The Coalesce will check expression 1 (The first column you want to check) for a value, and if one is found, it will be returned, but if it’s NULL, then expression 2 will be checked, if that’s NULL, then expression 3 will be checked.

The function will check all of the expressions listed within the bracers until a value is found, so if expression 1 has a value, it will be returned and the function wont bother checking the others. If all of the expressions listed are NULL, then a NULL will be returned.

The “[Phone Number]” will just name the column “Phone Number” in the result set.

You may pass in as many expressions as necessary, but remember – the more expressions the function has to check, the worse your performance will be.

Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)

Returning Tabular Data as XML with FOR XML RAW

Working with XML in SQL isn’t something that I’ve done much of in the past. Over the last few weeks, however, I have been called upon to produce data in XML form – which has lead to a bit of research and reading on my part, and I thought I’d post on my findings.

FOR XML RAW appears to be the easiest of the 4 available XML modes in SQL. I will post on the other 3 (Auto, Explicit, and Path) at some point, but for now I’ll explain Raw.

I’ll start with an example, and work my way through it with explanations afterwards.

SELECT 1 AS [Column1]
UNION
SELECT 2 AS [Column2]
UNION
SELECT NULL AS [Column3]
FOR XML RAW('Test'), ROOT('Rows'), ELEMENTS XSINIL

Running the above SQL will return the below XML:

<Rows xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Test>
    <Column1 xsi:nil="true" />
  </Test>
  <Test>
    <Column1>1</Column1>
  </Test>
  <Test>
    <Column1>2</Column1>
  </Test>
</Rows>

Everything after the keyword RAW is optional in the query, but it all helps tidy up your result set, as I’m about to explain.

Specifying FOR XML RAW simply turns your tabular data into an XML Fragment. An XML Fragment is essentially part of a full XML structure, it doesn’t contain a root element – such as “Rows” in the example XML. Each of the 3 select’s in the example would create a single fragement line such as the below:

<row />
<row Column1="1" />
<row Column1="2" />

Specifying (‘Test’) applies the name Test to each of the XML Fragement Elements. In the above XML Fragment example, each occurrence of the word “row” would be replaced with “Test”.

Adding ROOT(‘Rows’) places root tags around your XML Fragment, turning the output into this:

<Rows>
  <Test />
  <Test Column1="1" />
  <Test Column1="2" />
</Rows>

Next we have the keyword Elements, which transforms your XML output into XML Elements, rather than XML Attributes:

<Rows>
  <Test />
  <Test>
    <Column1>1</Column1>
  </Test>
  <Test>
    <Column1>2</Column1>
  </Test>
</Rows>

Lastly theres XSINIL. This is how XML handles SQL NULL’s, which are known as NIL in XML. If this isn’t specified and NULL is returned in your result set, an empty element will be listed – such as the self closing Test tag in the last XML example. Using this option will add an XML namespace as an attribute of the root, and add xsi:nil=”true” as an attribute to any NULL / NIL records. Note that this option can only be specified if ELEMENTS is used.

Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)

Managing Check Constraints

Check Constraints are an important part of data validation, and should be used at every opportunity.

A lot of DBAs seem to trust the checking done against data before it reaches their database is sufficient enough, but this is a very bad assumption to make. Sometimes people miss things, human error – it happens, unfortunately!

A website may be missing one small detail in its validation, which could cause a major head ache later down the line, and this is why it’s always important to do your own validation.

How they work

A check constraint is essentially a WHERE clause added to a table to confirm that the data being entered into the table is valid.

The boolean check of the constraint decides whether or not the data is allowed to be inserted.

Adding a Check Constraint to an existing table

ALTER TABLE Products
ADD CONSTRAINT ProductsPriceCheck CHECK (Price > 0)

The above code will add a check constraint, named ProductsPriceCheck, to the Products table which checks that the column Price is greater than zero. If someone attempts to enter a value which breaks this rule, an error will be raised.

Please remember that even though NULL isn’t greater than zero, it will get through this check! If you don’t want NULL values to be entered, specify “AND Price IS NOT NULL” in the check – or better yet, add “NOT NULL” to the column.

Adding a Check Constraint during table creation

CREATE TABLE dbo.Products 
   (
    ProductId INT IDENTITY PRIMARY KEY,
    ProductName VARCHAR(45) NOT NULL,
    Price MONEY NOT NULL,
    CONSTRAINT ProductsPriceCheck CHECK (Price > 0)
   )

The syntax here is very similar, but there is now no longer any need for the keyword ADD, as you arn’t adding the constaint to an existing table, it’s being created along with the table.

A check constraint can span multiple columns within the table, but cannot reference anything outside of the table it’s being created against. In order to use multiple columns, reference them exactly as you would when writing a WHERE clause against the table.

A very practical example of a Check Constraint doing what it does best would be against a column that holds an email address:

CREATE TABLE dbo.Person 
   (
    PersonId INT IDENTITY PRIMARY KEY,
    FirstName VARCHAR(45) NOT NULL,
    Surame VARCHAR(45) NOT NULL,
    Email VARCHAR(100) NOT NULL,
    CONSTRAINT PersonEmailCheck CHECK (Email LIKE '%@%' AND Email LIKE '%.%')
   )

This example makes sure that the column Email has an @ symbol and a period – two of the things that absolutely every email address will have.

Removing a Check Constraint

Should you decide that a Check Constraint is no longer needed, then you’ll be able to delete it from the Object Explorer by browsing to the Server > Database > Tables > Expand the table you’re deleting the check constraint from > Constraints > Right click on the constraint and select Delete.

You can also remove the constraint by executing the below SQL:

ALTER TABLE table_name
DROP CONSTRAINT constraint_name
Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)

Soundex – Overcome Spelling Issues

Spelling is a big issue of mine, I just can’t do it! The Soundex built in SQL function helps overcome any spelling issues by assigning a value to a varchar depending on the phonetics of the word.

For instance, people are constantly spelling my name incorrectly.. Zac, Zack, Zak.. it’s Zach!!

Passing any of the above into the soundex function, will all return a soundex value of “Z200″:

SELECT SOUNDEX('Zach')

This means that if someone emails me asking about an account for Zak Stagers, I can use the below SQL to find the correctly spelled account:

SELECT FirstName
 
FROM Users
 
WHERE SOUNDEX(FirstName) = SOUNDEX('Zak')

The above will return all accounts that have a Soundex value of Z200, allowing me to find the correct account even without the correct spelling

Digg This
Reddit This
Stumble Now!
Buzz This
Vote on DZone
Share on Facebook
Bookmark this on Delicious
Kick It on DotNetKicks.com
Shout it
Share on LinkedIn
Bookmark this on Technorati
Post on Twitter
Google Buzz (aka. Google Reader)