By: Aaron Bertrand | | Related Tips:More >T-SQL
Recently on Twitter there was a conversation about tabs vs. spaces in T-SQL code. Some preferred tabs because it was less work to press tab once than space bar four times, and others argued that, with fewer characters, T-SQL code would perform better. I’ll leave the former to people who really want to argue about that, and focus on the latter. Can tabs lead to better performance than spaces? Andy Mallon ( @AMtwo ) successfully baited me into testing performance between the two options in his recent T-SQL Tuesday blog post, “ Tabs vs. Spaces .”
The fun with addressing problems like these is always coming up with a test rig. I decided to create two stored procedures, one using spaces, the other tabs, and I mean a lot of spaces or tabs. The workflow is that the procedure creates and assigns a variable for every column in every catalog view, and the assignment statement is polluted with spaces or tabs. Each location of the token
$st gets replaced by either 16 spaces or 4 tabs, with the assumption that when you create tabs you want them to represent 4 spaces. I named the procedures
*Char9s instead of
*Tabs so that the object names themselves would have the same length. And
Char9 represents the fact that a tab, in ASCII, is actually
Here is the code I used to generate the procedures:
DECLARE @spaces varchar(16) = REPLICATE(' ', 16), @tabs varchar(4) = REPLICATE(CHAR(9), 4), -- CHAR(9) = ASCII tab @sql nvarchar(max), @src nvarchar(max) = N'CREATE OR ALTER PROCEDURE dbo.ThisOneUses$type AS BEGIN SET NOCOUNT ON;' SELECT TOP (100) @src += N' $st DECLARE $st @' + v.[name] + c.[name] + N' $st char(10) $st = $st (SELECT $st ' + N'MAX(LEFT(RTRIM(' + QUOTENAME(c.[name]) + '),10))' + N'$st FROM $st sys.' + v.[name] + ');' FROM sys.all_views AS v INNER JOIN sys.all_columns AS c ON v.[object_id] = c.[object_id] WHERE v.[schema_id] = 4 -- only sys AND c.system_type_id NOT IN (34,98); -- image, sql_variant SELECT @src += N'END'; EXEC sys.sp_executesql @sql; -- create tabs version SET @sql = REPLACE(REPLACE(@src,N'$type',N'Char9s'),N'$st',@tabs); EXEC sys.sp_executesql @sql; -- create spaces version SET @sql = REPLACE(REPLACE(@src,N'$type',N'Spaces'),N'$st',@spaces); EXEC sys.sp_executesql @sql; GO
It's a bit ugly, but essentially, it ends up with procedures like this, the only difference being that in one all that white space is full of spaces, and in the other, it’s tabs:
First, let’s check whether all of these spaces really add up to a substantially larger definition of a stored procedure:
SELECT p.[name], size = DATALENGTH(m.[definition]) FROM sys.all_sql_modules AS m INNER JOIN sys.procedures AS p ON m.[object_id] = p.[object_id] WHERE p.[name] LIKE N'ThisOneUses%';
So, yes, the first bit turns out to be true; the size of the procedure in plain text form is quite different:
Now, what if we compress the value?
SELECT DATALENGTH(COMPRESS(OBJECT_DEFINITION(OBJECT_ID(N'dbo.ThisUsesSpace')))), DATALENGTH(COMPRESS(OBJECT_DEFINITION(OBJECT_ID(N'dbo.ThisUsesChar9'))));
Now the difference is much more negligible:
Let’s run each query 100 times, and we can measure other performance aspects using our good old pal
DBCC FREEPROCCACHE; SET ANSI_WARNINGS OFF; GO EXEC dbo.ThisOneUsesSpaces; GO 100 EXEC dbo.ThisOneUsesChar9s; GO 100 SELECT LEFT(t.[text], 41), execution_count, total_elapsed_time, max_elapsed_time, avg_elapsed_time = CONVERT(DECIMAL(19,2), total_elapsed_time * 1.0 / execution_count) FROM sys.dm_exec_procedure_stats AS p CROSS APPLY sys.dm_exec_sql_text(p.[sql_handle]) AS t WHERE t.[text] LIKE '%ThisOneUses%';
The results show that while there are outliers (both versions had executions with really high runtimes), the total and average elapsed times were pretty darn close. Those numbers are in microseconds, so the average is off by a whole millisecond :
And if we execute this simpler batch in Plan Explorer (which will allow us to capture actual duration and compile time):
EXEC dbo.ThisOneUsesSpaces; EXEC dbo.ThisOneUsesChar9s; EXEC dbo.ThisOneUsesSpaces WITH RECOMPILE; EXEC dbo.ThisOneUsesChar9s WITH RECOMPILE;
We see that there is still negligible difference, and in fact while the tabs won out in one case, they lost when a
RECOMPILE was requested. At this scale it’s hard to really pick one as a winner; you could run this batch 100 times (and in fact I did), and you’re likely to see the scales tip each way 50 times.
Now, you could try doing this with a lot more statements, and I did – from 1,000 all the way up to over 11,000, which was based on the number of system columns on my specific build of SQL Server. All you’ll find is that the differences continue to be negligible; it just takes a lot longer to run the tests. So please feel free to run these tests on your own system and see if you get different results – just change SELECT TOP (100) to whatever number you like, knowing that it just increases the number of statements that are run and the number of spaces or tabs that are in the source code.
You can also confirm that the stored execution plan is the same size – the plan itself is completely oblivious to the characters used in the query text that drove the plan:
While I will concede that the size of the stored procedure as it exists in the system catalog is definitely affected by the use of spaces or tabs, in these tests, I was unable to detect any practical difference in terms of actual performance. And since the size of the code stored in your databases is typically such a small fraction of the size of the data stored in those same databases, you’d have to have a metric ton of stored procedures chock full of white space to start extending the argument to I/O, backup performance, and so on. So, you can probably throw away both the size and performance arguments for or against one or the other, and stick to other (more qualitative or religious) arguments, when debating tabs and spaces.
See these other tips and resources: