Saturday, March 21, 2009

TempDB: Table variable vs. local temporary table

As you know, tempdb is used by user applications and SQL Server alike to store transient results needed to process the workload. The objects created by users and user applications are called 'user objects', while the objects created by the SQL Server engine as part of executing/processing the workload are called 'internal objects'. In this blog, I will focus on user objects, and on table variables in particular.

There are three types of user objects: ##table, #table and table variable. Please refer to BOL for specific details. While the differences between ##table (global temporary table) and #table (local temporary table) are well understood, there is a fair amount of confusion between #table and table variable. Let me walk through the main differences between these.

A table variable, like any other variable, is a very useful programming construct. The scoping rules of a table variable are similar to those of any other programming variable. For example, if you define a variable inside a stored procedure, it can't be accessed outside the stored procedure. Incidentally, #table is very similar. So why did we create table variables? Well, a table variable can be very powerful when used with stored procedures to pass it as an input/output parameter (new functionality available starting with SQL Server 2008) or to store the result of a table-valued function. Here are some similarities and differences between the two:
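To illustrate the SQL Server 2008 parameter-passing scenario, here is a minimal sketch of a table-valued parameter; the type, procedure, and column names (dbo.IntList, dbo.sum_ids, id) are made up for the example:

```sql
-- A user-defined table type is required before a table variable
-- can be passed as a parameter
CREATE TYPE dbo.IntList AS TABLE (id int NOT NULL);
GO
CREATE PROCEDURE dbo.sum_ids
    @ids dbo.IntList READONLY      -- table-valued parameters must be READONLY
AS
    SELECT SUM(id) AS total FROM @ids;
GO
-- Fill a table variable of that type and pass it to the procedure
DECLARE @t dbo.IntList;
INSERT INTO @t VALUES (1), (2), (3);
EXEC dbo.sum_ids @ids = @t;        -- returns 6
```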

· First, the table variable is NOT necessarily memory resident. Under memory pressure, the pages belonging to a table variable can be pushed out to tempdb. Here is one example that shows the space taken by a table variable in tempdb:

use tempdb
go
drop table #tv_source
go
create table #tv_source(c1 int, c2 char(8000))
go
declare @i int
select @i = 0
while (@i < 1000)
begin
    insert into #tv_source values (@i, replicate('a', 100))
    select @i = @i + 1
end

DECLARE @tv_target TABLE (c11 int, c22 char(8000))
INSERT INTO @tv_target (c11, c22)
SELECT c1, c2
FROM #tv_source

-- checking the size through the DMV.
-- The sizes here are in 8k pages. This shows the space allocated
-- to user objects to be 2000 pages (1000 pages for #tv_source and
-- 1000 pages for @tv_target)
select total_size = SUM(unallocated_extent_page_count) +
                    SUM(user_object_reserved_page_count) +
                    SUM(internal_object_reserved_page_count) +
                    SUM(version_store_reserved_page_count) +
                    SUM(mixed_extent_page_count),
       SUM(unallocated_extent_page_count) as freespace_pgs,
       SUM(user_object_reserved_page_count) as user_obj_pgs,
       SUM(internal_object_reserved_page_count) as internal_obj_pgs,
       SUM(version_store_reserved_page_count) as version_store_pgs,
       SUM(mixed_extent_page_count) as mixed_extent_pgs
from sys.dm_db_file_space_usage

· Second, when you create a table variable, it is like a regular DDL operation and its metadata is stored in the system catalog. Here is one example to check this:

declare @ttt TABLE(c111 int, c222 int)
select name from sys.columns where object_id > 100 and name like 'c%'

This will return two rows containing columns c111 and c222. This means that if you were encountering DDL contention in tempdb, you cannot address it by changing a #table to a table variable.

· Third, the transactional and locking semantics differ. Table variables don't participate in transactions or locking. Here is one example:


-- create a source table
create table tv_source(c1 int, c2 char(100))
go
declare @i int
select @i = 0
while (@i < 100)
begin
    insert into tv_source values (@i, replicate('a', 100))
    select @i = @i + 1
end

-- using #table
create table #tv_target (c11 int, c22 char(100))
go
BEGIN TRAN
INSERT INTO #tv_target (c11, c22)
SELECT c1, c2
FROM tv_source

-- using table variable
DECLARE @tv_target TABLE (c11 int, c22 char(100))
BEGIN TRAN
INSERT INTO @tv_target (c11, c22)
SELECT c1, c2
FROM tv_source

-- Now if I look at the locks, you will see that only the
-- #table takes locks. Here is the query I used
-- to check the locks
select
    t1.request_session_id as spid,
    t1.resource_type as type,
    t1.resource_database_id as dbid,
    (case resource_type
        WHEN 'OBJECT' then object_name(t1.resource_associated_entity_id)
        WHEN 'DATABASE' then ' '
        ELSE (select object_name(object_id)
              from sys.partitions
              where hobt_id = resource_associated_entity_id)
     END) as objname,
    t1.resource_description as description,
    t1.request_mode as mode,
    t1.request_status as status,
    t2.blocking_session_id
from sys.dm_tran_locks as t1 left outer join sys.dm_os_waiting_tasks as t2
    ON t1.lock_owner_address = t2.resource_address

Another interesting aspect is that if I roll back the transaction involving the table variable above, the data in the table variable is not rolled back.

Rollback

-- this query will return 100 for the table variable but 0 for #table
SELECT COUNT(*) FROM @tv_target

· Fourth, the operations done on a table variable are not logged. Here is the example I tried:

-- create a table variable, insert a bunch of rows and update
DECLARE @tv_target TABLE (c11 int, c22 char(100))
INSERT INTO @tv_target (c11, c22)
SELECT c1, c2
FROM tv_source

-- update all the rows
update @tv_target set c22 = replicate('b', 100)

-- look at the top 10 log records. I get no records for this case
select top 10 operation, context, [log record fixed length], [log record length], AllocUnitId, AllocUnitName
from fn_dblog(null, null)
where AllocUnitName like '%tv_target%'
order by [Log Record Length] Desc

-- create a local temp table
drop table #tv_target
go
create table #tv_target (c11 int, c22 char(100))
go
INSERT INTO #tv_target (c11, c22)
SELECT c1, c2
FROM tv_source

-- update all the rows
update #tv_target set c22 = replicate('b', 100)

-- look at the log records. Here I get 100 log records for the update
select operation, context, [log record fixed length], [log record length], AllocUnitName
from fn_dblog(null, null)
where AllocUnitName like '%tv_target%'
order by [Log Record Length] Desc

· Fifth, no DDL is allowed on table variables. So if you have a large rowset which needs to be queried often, you may want to use a #table when possible so that you can create appropriate indexes. You can get around this limitation by declaring unique constraints inline when declaring the table variable, since a constraint creates an index underneath.
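As a sketch of that workaround (the table and column names here are made up): constraints declared inline in the DECLARE are the only way to get indexes on a table variable, since ALTER TABLE and CREATE INDEX cannot target it.

```sql
-- No ALTER TABLE or CREATE INDEX is possible on @orders afterwards,
-- but constraints declared inline create indexes under the covers
DECLARE @orders TABLE
(
    order_id int NOT NULL PRIMARY KEY,   -- backed by a clustered index
    customer int NOT NULL,
    UNIQUE (customer, order_id)          -- backed by a nonclustered index
);
```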

· Finally, no statistics are maintained on table variables, which means that changes in the data in a table variable will not cause recompilation of queries accessing that table variable. Also, queries involving table variables don't generate parallel plans.
