23 August 2010

Joining SQL Server Tables on Composite Column Values

I am working with a database server that hearkens from the not-too-distant past, where disk space was extremely costly.  It made sense to combine several fields (columns), to make the file (table) more compact.  The values were split and evaluated by mainframe code, which was less costly, because the logic and database were on the same machine.  These are the fiery brimstone of relational database hell.


The database is being transferred to a new SQL Server, where disk storage costs pennies for the gigabyte, and relational architecture reigns supreme.  These composite columns present a challenge, because only a portion of their value must be used to join another table -- not exactly something that Query Builder can handle.

The following Orders and Products tables must be joined. The middle 3 digits of Orders.InvoiceNumber match Customers.CustomerId:

CREATE TABLE [dbo].[Orders](
    [OrderId] [int] NOT NULL,
    [InvoiceNumber] [varchar(50)] NOT NULL,
    [OrderDate] [datetime] NOT NULL)

CREATE TABLE [dbo].[Customers](
    [CustomerId] [int] NOT NULL,
    [CustomerName] [varchar](50) NOT NULL)

To join these tables, we simply use the SUBSTRING() method, to extract the middle 3 characters of InvoiceNumber. The resulting Varchar typed data may then be implicitly cast to int:

SELECT *
FROM Orders o INNER JOIN Customers c 
    ON CAST(SUBSTRING(o.InvoiceNumber, 4, 3) AS int) 
    = c.CustomerId

As always, I recommend saving queries as stored procedures, to leverage the security and performance benefits of calling pre-compiled code.

1 comment:

Please provide details, when posting technical comments. If you find an error in sample code or have found bad information/misinformation in a post, please e-mail me details, so I can make corrections as quickly as possible.