Subscribe

RSS Feed (xml)

Powered By

Skin Design:
Free Blogger Skins

Powered by Blogger


Tuesday, September 23, 2008

Using Column Sets with Sparse Columns

Although the cost of storage is comparatively low, we still need to consider saving space by using various techniques such as compression and archiving. When you think of space savings the first thing you think of is the file system however space savings can be applied to the databases as well. When we create a database we ensure that the data files are created with the appropriate size and growth rate. We periodically analyze the sizes of our databases and perform shrinking operations. We may be performing these tasks for different purposes, but in a way, these tasks help us to ensure optimal storage for our databases. Microsoft SQL Server provides us various techniques for reducing the space used by a database. SQL Server 2008 introduces a technique for addressing nullable columns that provides optimal storage for nullable columns. This new feature in SQL Server 2008 is known as SPARSE columns. This article does not cover much about the features of SPARSE columns, but it does cover the usage of sparse columns with Column Sets, and the things you need to know and consider when using them.

This article contains:
What is a SPARSE Column?
What is a Column Set?
Inserting and updating data in a column set
Tracking modifications with triggers
Applying security on a column set

What is a Sparse Column?
The sparse column is an ordinary column just like other columns but it reduces the storage requirement for null values. A nullable column can be made as sparse column by adding the SPARSE keyword when the table is created or altered. Once the column is a SPARSE column SQL Server will not allocate space for null values. Note that in using this feature it adds an overhead for data retrieval of non-null values. Therefore you need to carefully apply this feature for columns by calculating the space that can be saved. It is recommended to make the column a SPARSE column only if the space that could be saved is at least 20 to 40 percent. BOL provides a table that contains the required null percentages in the columns for each data type in order to make the columns a SPARSE column. Refer to the article "Using Sparse Columns" in BOL for more info.

What is a Column Set?
The Column set is a column that represents all the sparse columns added to the table as a xml-typed column. It is not physically available in the table, it is just like a calculated column but it allows you to make modifications on it. It is recommended to have a column set only if you have lot of sparse columns in you table because it speeds up the modifications and the retrieval if the column set is used instead of the sparse columns individually. You can get more information about column sets if you refer to "Using Column Set" in BOL.

The following code shows the way of creating a column set for a table.

Code 1: Creating a table with sparse columns and a column set.

CREATE TABLE [dbo].[Customers]
(
[Id] int PRIMARY KEY,
[FirstName] varchar(50) NOT NULL,
[LastName] varchar(50) NOT NULL,
[Gender] bit SPARSE NULL, -- 1 = male, 2 = female
[Telephone] varchar(15) SPARSE NULL,
[MonthlyIncome] money SPARSE NULL,
[Comments] varchar(1000) SPARSE NULL
[AllSparseColumns] xml COLUMN_SET FOR ALL_SPARSE_COLUMNS
)

I have added the SPARSE keywords for all nullable columns, but as I mentioned earlier, the percentage of null values should be analyzed before making them sparse columns. Note that you need to add this column when you create the table. SQL Server does not allow you to add a column set by using the ALTER TABLE statement, but it does allows you to have a column set column without having sparse columns. The columns that are added later as sparse columns will be available with the added column set. See the code given below;

Code 2: Creating a table with a column set, without making any columns as sparse columns.

-- adding column set without sparse columns
CREATE TABLE [dbo].[Customers_1]
(
[Id] int PRIMARY KEY,
[FirstName] varchar(50) NOT NULL,
[LastName] varchar(50) NOT NULL,
[Gender] bit NULL, -- 1 = male, 2 = female
[Telephone] varchar(15) NULL,
[MonthlyIncome] money NULL,
[Comments] varchar(1000) NULL,
[AllSparseColumns] xml COLUMN_SET FOR ALL_SPARSE_COLUMNS
)

-- inserting a record
INSERT INTO dbo.Customers_1
([Id], [FirstName], [LastName], [Gender], [Telephone], [MonthlyIncome], [Comments])
VALUES
(1, 'Dinesh', 'Priyankara', 1, '777395871', 20000, 'no comments')

-- this returns null
SELECT AllSparseColumns FROM dbo.Customers_1

-- Make the Gender column as a sparse column
ALTER TABLE [dbo].[Customers_1]
ALTER COLUMN [Gender] bit SPARSE NULL
GO

-- Make the Telephone column as a sparse column
ALTER TABLE [dbo].[Customers_1]
ALTER COLUMN [Telephone] varchar(15) SPARSE NULL

-- Now it returns values of sparse columns as a xml
SELECT AllSparseColumns FROM dbo.Customers_1
Inserting and updating data in a Column Set
Records can be inserted to the sparse columns without using the column set but once inserted, records can be retrieved by using the column set.

Code 1: Inserting a record without inserting a value to the column set.

-- Insert a record to the table.
INSERT INTO dbo.Customers
([Id], [FirstName], [LastName], [Gender], [Telephone], [MonthlyIncome], [Comments])
VALUES
(1, 'Dinesh', 'Priyankara', 1, '777395871', 20000, 'no comments')

-- Retrieve the record and see
SELECT [Id], [FirstName], [LastName], [Gender], [Telephone], [MonthlyIncome], [Comments], [AllSparseColumns] FROM dbo.Customers

/*
Result:
177739587120000.0000no comments
*/

Records can be inserted and updated by using the column set. Code 2 shows the way of inserting a record and updating a record via column set.

Code 2: Inserting and updating the column set.

-- Inserting a new record. Note that the statement uses the column set to
-- insert values for Comments and Telephone columns
INSERT INTO dbo.Customers
([Id], [FirstName], [LastName], [AllSparseColumns])
VALUES
(3, 'Yeshan', 'Santhush', 'No comments777225656')

-- Update the record.
-- This makes Comments column NULL because xml string does not contain a node for Comments column.
-- This updates the Telephone column with the new value.
UPDATE dbo.Customers
SET [AllSparseColumns] = '7774546321'
WHERE Id = 3

Tracking modifications with triggers
This is bit tricky. Normally we use the UPDATE() function to find out whether a particular column is updated or not. If you have implemented this in a trigger that is associated with a table that has sparse columns and a column set, the return value of the UPDATE() function will not be what you anticipated. The folloLet us test this one too.

Code 1: Creating a UPDATE trigger on the Customers table.

-- Creating a update trigger on Customers table.
CREATE TRIGGER tr_Customers_Update ON dbo.Customers
FOR UPDATE
AS
BEGIN
IF UPDATE(Gender)
print 'Gender column updated.'
IF UPDATE(Telephone)
print 'Telephone column updated.'
IF UPDATE(Comments)
print 'Comments column updated.'
IF UPDATE(AllSparseColumns)
print 'AllSparseColumns column updated.'
END

When you update the column set explicitly, the UPDATE() function returns true for the column set. Not only that, the UPDATE() functions that are written to all sparse columns return true too. When a sparse column is explicitly updated, the UPDATE() function returns true for the sparse column and for the column set.

Code 2: Updating the table and testing the UPDATE() function in the trigger.

-- Update the column set.
-- This update makes all UPDATE() functions
-- to return true.
UPDATE dbo.Customers
SET [AllSparseColumns] = '4455'
WHERE Id = 3

/*
Result:
Gender column updated.
Telephone column updated.
Comments column updated.
AllSparseColumns column updated.

(1 row(s) affected)
*/

-- Update the Gender column.
-- This update makes UPDATE() function of
-- Gender column and column set to return true.
UPDATE dbo.Customers
SET Gender = 1
WHERE Id = 3

/*
Result:
Gender column updated.
AllSparseColumns column updated.

(1 row(s) affected)
*/

If you have written the same trigger for INSERT statements, you will see the same behavior for the INSERT operation. When a value is inserted to one of the sparse columns and you make the others NULL, the UPDATE() function returns true to the sparse column and to the column set. When a value is inserted to the column set, the UPDATE() function returns true to the columns set and for the all sparse columns.
Applying security on a column set
Security can be applied to a column set just like applying security to other columns, but permissions on sparse columns may affect the data retrieval from the column set. Let's do some testing on.

First, let's grant SELECT permission to all sparse columns and try to retrieve data from the column set. You need to have a separate account for this test. If you do not have an additional account, create a login and a user as User1. Let us try to retrieve data with User1 permissions.

Code 1: Retrieving and updating data with User1's account

-- Set the execution context to the user User1
EXECUTE AS USER = 'User1'

-- select statement 1
SELECT Gender, Telephone, MonthlyIncome, Comments FROM Customers

-- select statement 2
SELECT AllSparseColumns FROM Customers

-- select statement 3
UPDATE dbo.Customers
SET Gender = 1
WHERE Id = 3

-- select statement 4
UPDATE dbo.Customers
SET [AllSparseColumns] = '777225656Test msg1'
WHERE Id = 3

REVERT

Code 2: Granting SELECT permission on sparse columns to User1 and execute Code 1

-- Grant select permission to all sparse columns
GRANT SELECT (Gender, Telephone, MonthlyIncome, Comments) ON OBJECT::dbo.Customers TO User1

-- Execute the code 1:
-- select statement 1 - will success
-- select statement 2 - will fail
-- select statement 3 - will fail
-- select statement 4 - will fail

-- Remove SELECT permission from User1
REVOKE SELECT (Id, Gender, Telephone, MonthlyIncome, Comments) ON OBJECT::dbo.Customers TO User1

Though we have granted SELECT permission to all sparse columns, users cannot retrieve data from the column set. It requires explicit SELECT permission. But if we grant both SELECT and UPDATE permissions on sparse columns, User1 will be able to access the column set. But User1 will not be able to update the column set.

Code 3: Granting SELECT and UPDATE permission on sparse columns to User1 and execute Code 1

-- Grant select permission to all sparse columns
GRANT SELECT, UPDATE (Gender, Telephone, MonthlyIncome, Comments) ON OBJECT::dbo.Customers TO User1

-- Execute the code 1
-- select statement 1 - will success
-- select statement 2 - will success
-- update statement 3 - will success
-- update statement 4 - will fail

-- Remove SELECT, and UPDATE permissions from User1
REVOKE SELECT, UPDATE (Id, Gender, Telephone, MonthlyIncome, Comments) ON OBJECT::dbo.Customers TO User1

Now let us grant SELECT on the column set and try to access sparse columns.

Code 4: Granting SELECT permission on the column set to User1 and execute Code 1

-- Grant select permission to the column set
GRANT SELECT (AllSparseColumns) ON OBJECT::dbo.Customers TO User1

-- Execute the code 1
-- select statement 1 - will fail
-- select statement 2 - will success
-- update statement 3 - will fail
-- update statement 4 - will fail

-- Remove SELECT permission from User1
REVOKE SELECT (AllSparseColumnss) ON OBJECT::dbo.Customers TO User1

Just like the code in Code 3, if we grant both SELECT and UPDATE permissions on column set to User1, SELECT statement 2 will be successful. In addition to that, User1 will be able to execute the UPDATE statement on the column set but not on the sparse columns. See the Code 5.

Code 5: Granting SELECT and UPDATE permissions on the column set to User1 and execute Code 1

-- Grant select and update permissions to the column set
GRANT SELECT, UPDATE (AllSparseColumns) ON OBJECT::dbo.Customers TO User1

-- Execute the code 1
-- select statement 1 - will success
-- select statement 2 - will success
-- update statement 3 - will fail
-- update statement 4 - will success

-- Remove SELECT and UPDATE permission from User1
REVOKE SELECT, UPDATE (AllSparseColumnss) ON OBJECT::dbo.Customers TO User1

Now let us test how DENY permissions are propagated. Let us grant SELECT permission on the sparse column and deny SELECT on column set. As you expect, User1 will be able to access all sparse columns not the column set. Deny SELECT permission on the column set has NO effect no sparse columns.

Code 6: Granting SELECT permissions on sparse columns and denying SELECT on the column set to User1 and execute Code 1

-- Grant SELECT permission on sparse columns
GRANT SELECT (Id, Gender, Telephone, MonthlyIncome, Comments) ON OBJECT::dbo.Customers TO User1

-- Deny SELECT permission on the column set
DENY SELECT (AllSparseColumns) ON OBJECT::dbo.Customers TO User1

-- Execute the code 1
-- select statement 1 - will success
-- select statement 2 - will fail
-- update statement 3 - will fail
-- update statement 4 - will fail

REVOKE ALL ON OBJECT::dbo.Customers TO User1
GO

But when the SELECT permission on sparse columns is denied, it propagates to the column set. See code 7. User1 will not be able to access the column set even though we have granted SELECT permission on the column set.

Code 7: Denying SELECT permissions on sparse columns and granting SELECT on the column set to User1 and execute Code 1

-- Deny SELECT permission on sparse columns
DENY SELECT (Id, Gender, Telephone, MonthlyIncome, Comments) ON OBJECT::dbo.Customers TO User1

-- Grant SELECT permission on the column set
GRANT SELECT (AllSparseColumns) ON OBJECT::dbo.Customers TO User1

-- Execute the code 1
-- select statement 1 - will fail
-- select statement 2 - will fail
-- update statement 3 - will fail
-- update statement 4 - will fail

REVOKE ALL ON OBJECT::dbo.Customers TO User1
GO

Hope you learnt something from this. I would appreciate your comments on this article, please send your comments to dinesh@sqlserveruniverse.com.

No comments:

Post a Comment

Recent Posts

Archives