An SQL Server Game Changer
One day, I discovered the ease and utility of creating a temporary table in SQL Server. I thought it was a game changer. I used it everywhere, for any occasion, and was pretty quick to throw data into temporary tables for ease of querying later.
I have since learned there are some hidden costs and nuances to using a temporary table that I was previously unaware of. This doesn’t mean temporary tables are bad and should be avoided. It does mean there are times when a temporary table should be used, and others where a table variable is more appropriate.
Temporary Table vs Table Variable
There are two ways to create a temporary table:
1. ‘On the fly’ by designating an “INTO” clause after a SELECT statement and before a FROM clause:
2. The same way as defining an actual table by using the following syntax:
The second approach requires you to pre-designate which fields you’ll be storing and then execute insert commands to populate the table. From a simplicity standpoint, most people creating temporary tables would be more likely to use the “INTO” clause, so that’s what we’ll focus on in this blog post.
A table variable is set up in a very similar manner to creating a temporary table the second way. The syntax is as follows: “DECLARE @Temporary TABLE (an INT).” You can also populate a table variable with INSERT statements in the same manner that you would do for a stored table.
Creating Temporary Tables Using the “INTO” Clause
The appeal of the “INTO” clause is that when you have an expensive query to calculate in order to get results, you can just slap it all into a temporary table to keep it “in memory.” While this temporary table is “in memory,” you can execute on it as if it were a physical table. I am deliberately quoting the “in memory” portion above because that is a common misconception about how the temporary tables work, one that I was taking for granted.
As a matter of fact, a temporary table is a physical table that is stored in the TempDB schema (by default). Once the connection that was accessing the temporary table is dropped, so too is the physical table, meaning that cleanup is automatically handled for you.
Considerations for Using the “INTO” Clause Approach
Since the table is stored as a physical table, it can be affected by system transactions, for example, a ROLLBACK operation. Granted, a database is not likely to be getting the ROLLBACK execution run on it terribly often, but it could unexpectedly coincide with accessing a temporary table.
For example, a calculation is in the middle of processing yearly raises for everyone, and all the employees are stored in a temporary table. If a ROLLBACK operation occurs during execution for any reason, the result is an empty employee temporary table. As far as the execution is aware it is done executing the payroll updates, despite how much progress was made on the actual employee roster. This may be a farfetched example, but it does illustrate a pretty notable risk for any critical operation that utilizes a temporary table.
Table Variables and Data Integrity
Unlike temporary tables, table variables persist data even after a ROLLBACK command is executed. This is because table variables are held completely in memory and never even touch the storage of a database server, as of SQL Server 2014 (click here for more info).
Cleanup of Table Variables
Cleanup with table variables is simply the object being unloaded from memory as soon as the scope of the table variable’s declaration expires. An explicit drop of the variable isn’t required, in order to reinitialize it across consecutive runs, unlike temporary tables. Using the above example for temporary tables, if the employees who need a pay raise calculation are stored in a table variable, and a ROLLBACK is executed, the contents of the table variable remain untouched, and the process continues calculating without interruption.
Requirements to Distinguish Between Appropriate Use
As I said previously, there are times when a temporary table is a correct response, and others when a table variable is more suitable. Martin Smith wrote an extensive analysis on the merits and differences of each on StackOverflow and also on the DBA StackOverflow site.
Paraphrasing from Martin’s analysis, the best time to use a temporary table is when your requirements fit one of these categories:
1. You need an index in the table that cannot be created with the keywords UNIQUE or PRIMARY KEY
2. You need to add large amounts of data to and/or bulk remove from the table
3. You need to dynamically alter the execution plan based on the
4. You want to prevent the truncation of the TempDB transaction log until the user transaction ends the content of the table
5. You want to take advantage of Parallelism (more info)
The best time to use a table variable is when your requirements fit one of these categories:
1. You do not need to dynamically alter your execution plan based on the content of the table
2. You need the data in the table to survive a rollback operation
3. You are dealing with small quantities of data
Using Parameterized Execution Blocks
I have also discovered through experimentation that when using parameterized execution blocks (by passing a quoted select statement to sp_executesql and defining parameters, which is a handy way to avoid SQL-Injection attacks), a table variable is not very practical as it cannot easily be passed as a parameter.
For example, assume we have created a table variable named simply @tv. If we attempt to pass this variable as a parameter to the sp_executesql procedure, we will get an error indicating that the type for that parameter cannot be determined. This can be addressed by writing to the System Type table (sys.types in MS SQL 2010) and create an entry specifically for the contents of the table variable.
This approach poses two issues:
1. You will need to add an entry to the system types table every single time you need a table variable with a slightly different schema from a previously defined type each time
2. You may not know exactly how the data will be formatted prior to runtime, and this approach requires that explicit knowledge in order to work before execution
Meanwhile, a temporary table (#tv for reference) can be declared outside of the sp_executesql block and still referenced as if it were any other table without ever having to pass it in as a parameter. You can literally add the temporary table to your execution string as a “JOIN #tv b ON (a.id = b.id).” As long as the declaration of the temporary table is in scope, it will work.
On Using Temporary Tables in C#
I have also discovered through experimentation that there are issues with trying to generate a stored procedure with temporary tables to be used by any code-behind language. Namely, the processor is unable to determine the return type of a temporary table, and thus throws the following error:
Error: Unknown Return Type, The return types for the following stored procedures could not be detected
This is one of the big issues with trying to use a temporary table. Since the table is created and stored on the disk at run time, the parser cannot determine the contents of the temporary table during compilation. To resolve that issue, however, simply swap out the temporary table creation for table variable declaration, and that should resolve the issues presented. Ritesh Kesharwani wrote a succinct blog post on what the error was, its cause and its resolution at his blog.
There are different options available for holding data queried from a database for later consumption or calculation. It is important to note what they are when they are best used, and what to avoid.
- SQL Server Blog: SQL Server 2014 In-Memory OLTP: Memory-Optimized Table Types and Table Variables
- Stack Overflow
- Ritesh Kesharwani: “Error: Unknown Return Type, The return types for following stored procedures could not be detected….(LINQ)”
- Using Table Variable with SP_ExecuteSQL
- Understanding and Using Parallelism in SQL Server