Wednesday, April 22, 2009

Convert a string to a table in T-SQL

Summary: Transact-SQL function that converts a string of comma-separated numbers to a table of integers.

Here is a common SQL programming scenario. Your code receives a string holding comma-separated values (CSVs), such as IDs of records stored in a table (say "98,256,17,34"). Now it needs to retrieve these records from the database. How do you do this?

[Note: For the sake of simplicity, I use a comma-separated value (CSV) string holding numbers, but the string can contain other values, such as dates or other strings, and the values can be separated by semicolons or other delimiters.]

Option #1. Use EXEC (not recommended).
The obvious option would be to generate a dynamic SQL query and execute it via an EXEC statement. For example, you can do the following:
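USE AdventureWorks

DECLARE @ProductIDs varchar(256)
DECLARE @Query      varchar(1024)  -- large enough for the query text plus the ID list

-- The CSV would normally arrive as a parameter
SELECT @ProductIDs = '1,350,400,440'

-- Build the query text by concatenating the CSV into the IN list
SELECT @Query =
'SELECT
  ProductID,
  [Name],
  ProductNumber
FROM
  Production.Product
WHERE
  ProductID IN (' + @ProductIDs + ')'

EXEC(@Query)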
There are two major problems with this approach. First, building the query as a string (in this example it's a SELECT query, but it can be any query) hurts readability: you lose syntax highlighting and IntelliSense (whether you rely on the IntelliSense feature introduced in SQL Server 2008 or on a third-party tool, such as SQL Prompt). Second, and more important, this approach is vulnerable to SQL injection attacks. Even if you use a command parameter to pass the comma-separated value (CSV) string to the stored procedure, the string is still appended verbatim to the dynamic SQL, so you must make sure that it does not contain suspicious characters. Whether you do that check in T-SQL code or in the application code (C#, or whatever), it's a hassle.
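To make the injection risk concrete, here is a minimal sketch of what the concatenation produces when the input is hostile. The input value and the table name dbo.SomeTable are hypothetical, and the query is only printed, never executed.

```sql
DECLARE @ProductIDs varchar(256)
DECLARE @Query      varchar(1024)

-- Hypothetical malicious input: closes the IN list, then appends a second statement
SELECT @ProductIDs = '1); DROP TABLE dbo.SomeTable; --'

SELECT @Query =
  'SELECT ProductID FROM Production.Product WHERE ProductID IN (' + @ProductIDs + ')'

-- Print instead of EXEC so that nothing is actually run
PRINT @Query
-- Prints: SELECT ProductID FROM Production.Product WHERE ProductID IN (1); DROP TABLE dbo.SomeTable; --)
```

The trailing "--" comments out the closing parenthesis, so the printed text is a valid batch of two statements. Passing the same string through EXEC would run both.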

Option #2. Convert string of values to a table of values (recommended).
A better alternative would be to convert the CSV to a table of numbers (again, I use numbers in this example, but the values can be of any type). You can accomplish this with the help of a user-defined function (UDF), such as this one:
CREATE FUNCTION [dbo].[ConvertCsvToNumbers]
(
  @String AS VARCHAR(8000)
)
RETURNS
  @Numbers TABLE (Number INT)
AS
BEGIN
  -- Treat NULL as an empty string, replace tabs with spaces, and trim both ends
  SELECT @String = 
    LTRIM(
      RTRIM(
        REPLACE(
          ISNULL(@String, ''), CHAR(9) /* tab */, ' ')))

  IF (LEN(@String) = 0)
    RETURN

  DECLARE @StartIdx       INT
  DECLARE @NextIdx        INT
  DECLARE @TokenLength    INT
  DECLARE @Token          VARCHAR(16)

  SELECT  @StartIdx       = 0
  SELECT  @NextIdx        = 1

  WHILE @NextIdx > 0
  BEGIN
    SELECT @NextIdx = CHARINDEX(',', @String, @StartIdx + 1)

    SELECT @TokenLength =
      CASE WHEN @NextIdx > 0 THEN @NextIdx
      ELSE LEN(@String) + 1
    END - @StartIdx - 1

    SELECT @Token = 
      LTRIM(
        RTRIM(
          SUBSTRING(@String, @StartIdx + 1, @TokenLength)))

    IF LEN(@Token) > 0
      INSERT
        @Numbers(Number)
      VALUES
        (CAST(@Token AS INT))

    SELECT @StartIdx = @NextIdx
  END
  RETURN
END
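As a quick check of how the function treats untidy input, you can call it on its own. This is only a sketch that assumes the function above has been created in the current database; the sample string is made up for illustration.

```sql
-- Extra spaces and an empty token (two commas in a row) are tolerated:
-- each token is trimmed, and empty tokens are skipped.
SELECT Number
FROM dbo.ConvertCsvToNumbers(' 98, 256,,17 ,34 ')
-- Returns four rows: 98, 256, 17, 34
-- (row order is not guaranteed without an ORDER BY clause)
```

Note that a token that is not a valid integer (for example, 'abc') makes the CAST fail, so the whole call raises a conversion error rather than skipping the bad value.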
Now you can reference this UDF in a query, such as:
DECLARE @ProductIDs varchar(256)
SELECT @ProductIDs = '1,350,400,440'

SELECT
  prod.ProductID,
  prod.[Name],
  prod.ProductNumber
FROM
  Production.Product prod
  INNER JOIN dbo.ConvertCsvToNumbers(@ProductIDs) numbers
    ON prod.ProductID = numbers.Number
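The same lookup can be wrapped in a stored procedure that receives the CSV as an ordinary parameter. The sketch below assumes the AdventureWorks database and the UDF above; the procedure name dbo.GetProductsByIds is hypothetical.

```sql
-- Hypothetical wrapper procedure; the name is illustrative only.
CREATE PROCEDURE dbo.GetProductsByIds
(
  @ProductIDs VARCHAR(8000)
)
AS
BEGIN
  SET NOCOUNT ON

  SELECT
    prod.ProductID,
    prod.[Name],
    prod.ProductNumber
  FROM
    Production.Product prod
  WHERE
    -- The CSV is only ever parsed into integers, never executed as SQL
    prod.ProductID IN
      (SELECT Number FROM dbo.ConvertCsvToNumbers(@ProductIDs))
END
GO

EXEC dbo.GetProductsByIds @ProductIDs = '1,350,400,440'
```

Using IN with a subquery instead of a join is a design choice: if the CSV contains the same ID twice, a join returns the matching product twice, while IN returns it once.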
In addition to improving readability and security, this method lets you treat the data contained in the CSV as a regular table. If you want to try this approach, you can download the script that creates the UDF; it is more robust than the listing above (it sets the permissions and contains code for repeated compilation).

See also:
How To: Protect From SQL Injection in ASP.NET

2 comments:

  1. Thanks for the post. It helped me!

    A post on the reverse conversion, from table values to a string, would also be good!

  2. Thanks Rama. It's a good idea, but frankly, I never had to go the other way on the SQL side. Normally, the CSV string is generated on the client (in C#/JavaScript/etc).
