Conditional INSERT/UPDATE Race Condition

Conditional INSERT/UPDATE Race Condition

 

 

I often see conditional INSERT/UPDATE code like:

CREATE PROCEDURE dbo.Insert_Or_Update_Foo

      @ID int,

      @Bar int

AS

 

SET NOCOUNT ON

 

IF EXISTS(SELECT * FROM dbo.Foo WHERE ID = @ID)

BEGIN

      UPDATE dbo.Foo

      SET bar = @bar

      WHERE ID = @ID

END

ELSE

BEGIN

      INSERT INTO dbo.Foo (ID, Bar)

      VALUES (@ID, @Bar)

END

 

RETURN @@ERROR

 

Despite its prevalence, this “UPSERT” code is fundamentally flawed because it assumes only a single user will execute the proc at a time.  A primary key violation error can occur when executed simultaneously on different connections with the same @ID value instead of the intended UPDATE. 

An even worse situation is when a surrogate key (e.g. IDENTITY) is used as the primary key and the conditional insert is based on a natural key that has no unique constraint or unique index.  In that case, the both INSERTs will succeed but duplicate data (same natural key) will be inserted.  A unique constraint (or unique index) should of course be specified on every key but that’s sometimes overlooked.

You can test the concurrency problem yourself by creating the following table for use with the above stored procedure on a multi-processor box:

CREATE TABLE dbo.Foo

(

      ID int NOT NULL

            CONSTRAINT PK_Foo PRIMARY KEY,

      Bar int NOT NULL

)

 

Then run the following script on different database connections after changing the WAITFOR TIME to the near future:

WAITFOR TIME '08:00:00'

 

EXEC dbo.Insert_Or_Update_Foo

      @ID = 1,

      @Bar = 1

 

I ran this test and got a primary key violation because parallel execution of the IF EXISTS test returned false on both connections so the INSERT was attempted on both. 

Fortunately, there’s a simple solution to prevent the error:  1) add an explicit transaction and 2) specify SELECT locking hints.  Below is a modified version of the original stored procedure with these changes:

CREATE PROCEDURE dbo.Insert_Or_Update_Foo

      @ID int,

      @Bar int

AS

 

SET NOCOUNT, XACT_ABORT ON

 

BEGIN TRAN

 

IF EXISTS(SELECT * FROM dbo.Foo WITH (UPDLOCK, HOLDLOCK) WHERE ID = @ID)

BEGIN

      UPDATE dbo.Foo

      SET bar = @bar

      WHERE ID = @ID

END

ELSE

BEGIN

      INSERT INTO dbo.Foo (ID, Bar)

      VALUES (@ID, @Bar)

END

 

COMMIT

 

RETURN @@ERROR

 

The UPDLOCK hint instructs SQL Server to use an update lock instead of the shared lock that would normally be acquired for the SELECT.  HOLDLOCK is needed in the default READ COMMITTED isolation level to ensure that the lock is held until the end of the transaction.  This more restrictive update lock will prevent simultaneous queries from either selecting or changing the locked resource.  If the row exists during the SELECT, the locked resource is the existing row.  If no row exists with the specified key, a range lock on the primary key is acquired to prevent inserts of the same key until the lock is released. 

Although not actually required, I also added SET XACT_ABORT ON to help ensure the explicit transaction is closed following a timeout or unexpected error.  See Use Caution with Explicit Transactions in Stored Procedures to see why I recommend this as a standard practice.

I should add that this explicit transaction and locking hints technique will prevent the above mentioned problems but it doesn’t technically prevent a race condition.  The issue with any “UPSERT”/MERGE technique is that the last one in the race wins.  If you run the above proc simultaneously on different connections with the same @ID value but different @Bar values, the last Bar value UPDATE will of course overwrite any previous Bar value.

The issue doesn’t apply only to IF statements.  Even intra-query subqueries like the example below can fall victim to the issue.  Just like IF EXISTS, there is nothing to prevent the same concurrency problem when the NOT EXISTS predicate is evaluated simultaneously on different connections.

CREATE PROCEDURE dbo.Insert_Foo

      @ID int,

      @Bar int

AS

 

SET NOCOUNT ON

 

INSERT INTO dbo.Foo (ID, Bar)

VALUES (@ID, @Bar)

WHERE NOT EXISTS

      (

      SELECT *

      FROM dbo.Foo

      WHERE ID = @ID

      )

 

RETURN @@ERROR

 

Below is the modified proc with the transaction and locking hint changes:

CREATE PROCEDURE dbo.Insert_Foo

      @ID int,

      @Bar int

AS

 

SET NOCOUNT, XACT_ABORT ON

 

BEGIN TRAN

 

INSERT INTO dbo.Foo (ID, Bar)

VALUES (@ID, @Bar)

WHERE NOT EXISTS

      (

      SELECT *

      FROM dbo.Foo WITH (UPDLOCK, HOLDLOCK)

      WHERE ID = @ID

      )

 

COMMIT

 

RETURN @@ERROR

 

I’m not certain if the SQL 2008 MERGE statement suffers from the same concurrency issues since I don’t currently have an adequate (multi-processor) SQL 2008 test environment for the test.  I plan to install the next SQL 2008 CTP on a real (not virtual) machine and test the proc below.  I’ll post the results.

MERGE Stored Procedure

posted @ Sunday, October 28, 2007 10:29 AM

Print

Comments on this entry:

# re: Conditional INSERT/UPDATE Race Condition

Left by Dan Guzman at 10/31/2007 8:08 AM
Gravatar
Yes, there are also issues with a SERIALIZABLE isolation level (client or Transact-SQL) because the SELECT will acquire only shared locks by default. If you run the first proc example with SERIALIZABLE at the same time by 2 different sessions, you'll get a deadlock because both selects will return false but each INSERT will wait on the other session's shared lock from the SELECT.
In the SERIALIZABLE isolation level, you can either catch the deadlock error and retry or specify an UPDLOCK hint on the SELECT. Unlike READ COMMITTED, HOLDLOCK doesn't need to be specified because SQL Server implements SERIALIZABLE by holding locks.

--
Dan

# re: Conditional INSERT/UPDATE Race Condition

Left by Damian Mulvena at 11/19/2007 12:32 PM
Gravatar
how does this construct compare?

INSERT INTO dbo.Foo (
ID,
Bar
)
SELECT ID,
BAR
FROM (
SELECT @ID AS ID,
@Bar AS BAR
) x
LEFT JOIN dbo.Foo f
on f.ID = x.ID
where f.ID is null

# re: Conditional INSERT/UPDATE Race Condition

Left by Dan Guzman at 11/20/2007 7:31 AM
Gravatar
In the READ_COMMITTED isolation level, SQL Server releases locks when no longer needed unless the lock protects a modified resource. HOLDLOCK changes the lock retention behavior to SERIALIZABLE.

The documentation you mention is correct if the isolation level is SERIALIZABLE or REPEATABLE READ. The whole subject of isolation levels and locking hints is a confusing. I think the Books Online should discuss lock behavior separately in the context of diffferent isolation levels for clarity.

# re: Conditional INSERT/UPDATE Race Condition

Left by Dan Guzman at 11/20/2007 10:43 PM
Gravatar
Damian, I tested the INSERT...LEFT JOIN method and it seems less likely to get the primary key violation than the INSERT...NOT EXISTS technique without the hints. I think the only way to avoid the PK violation entirely is with the UPDLOCK hint.

# re: Conditional INSERT/UPDATE Race Condition

Left by Paul Chu at 1/19/2009 12:32 PM
Gravatar
Very nice article Dan !

Could you comment on if the Sql Server 2008 Merge Statement needs the UpdLock hints and if so where do we add the Hints
??? Merge dbo.Foo (UpdLock, HoldLock)

and how would we do a concurrent test ? on laptop with 2 processors ?


create table dbo.Foo
(

ID int NOT NULL
constraint PF_Foo Primary Key,
Bar int not null
)

go

create procedure dbo.Merge_Foo
@ID int, @Bar int
AS

Set NOCOUNT, XACT_ABORT on;

Merge dbo.Foo (UpdLock, HoldLock) as f
Using ( Select @ID as ID, @Bar as Bar ) as new_foo
on f.ID = new_foo.ID
when MATCHED Then
Update Set f.Bar = new_foo.Bar
when NOT MATCHED Then
Insert ( ID, Bar )
Values ( new_foo.ID, new_foo.Bar );

return @@error;

# re: Conditional INSERT/UPDATE Race Condition

Left by Dan Guzman at 1/24/2009 1:35 PM
Gravatar
Hi, Paul.

Thanks for the question. This motivated me to run the MERGE test I had planned long ago.

The MERGE statement also needs the locking hints (or higher transaction isolation level) to avoid the same race condition as do conditional insert/update statements. I ran a variation of your MERGE proc on a SQL 2008 quad core processor box and got a duplicate primary key violation. I think that any multi-cpu/core machine should do fine for such a test as long as there is no other significant activity.

Note that the WITH keyword is required to specify locking hints with MERGE. For example:

MERGE dbo.Foo WITH (UPDLOCK, HOLDLOCK)...

WITH is optional in some other DML statements but I recommend specifying it even when optional for consistency.

I'll elaborate a bit more the MERGE results in my next post.

# re: Conditional INSERT/UPDATE Race Condition

Left by Pat at 2/17/2009 12:28 PM
Gravatar
Nice article Dan, just had a very similar question in an interview. One point on the need for HOLDLOCK that Tom Grant raised, I maybe missing something in your scenario, but the UPDLOCK hint will hold until the end of the transaction with READ_COMMITTED. This is certainly my experince with SQL2000/2005.

BEGIN TRAN

SELECT *
FROM Foo WITH ( UPDLOCK )
WHERE FooId = 99999

EXEC sp_lock @@spid
COMMIT


BEGIN TRAN

SELECT *
FROM Foo WITH ( UPDLOCK,HOLDLOCK )
WHERE FooId = 99999

EXEC sp_lock @@SPID
COMMIT

Thoughts?

Thanks

Pat

# re: Conditional INSERT/UPDATE Race Condition

Left by Dan Guzman at 2/19/2009 7:33 AM
Gravatar
Hi, Pat.

Executing the SELECT with a transaction and with the UPDLOCK hint alone effectively changes the isolation level from READ COMMITTED (which releases locks) to REPEATABLE READ (which retains update locks to prevent read data from being changed but allows concurrent inserts). Adding both UPDLOCK and HOLDLOCK hints effectively changes the isolation level to SERIALIZABLE (which prevents updates to rows read and inserts of new rows). I should have been clearer about this in my earlier respone.

In the case of a conditional inserts only, one can get away with just the HOLDLOCK hint; the UPDLOCK is needed only when data are updated.

Below are traces of the lock acquired and released events of your queries. I first ran against an empty table and then again with a single row in the table (key value 1).

Empty table with only UPDLOCK:
Lock:Acquired 5 - OBJECT 8 - IX

Empty table with UPDLOCK and HOLDLOCK:
Lock:Acquired 5 - OBJECT 8 - IX
Lock:Acquired (ffffffffffff) 7 - KEY 14 - RangeS-U

No range lock is acquired without the HOLDLOCK. Nothing will prevent another session from inserting data with that because only the object level intent lock is held during the transaction.

Now take a look at the trace of the same queries against a non-empty table:

Non-empty table with only UPDLOCK:
Lock:Acquired 5 - OBJECT 8 - IX
Lock:Acquired 1:155 6 - PAGE 7 - IU
Lock:Released 1:155 6 - PAGE 0 - NULL

Non-empty table with UPDLOCK and HOLDLOCK:
Lock:Acquired 5 - OBJECT 8 - IX
Lock:Acquired 1:155 6 - PAGE 7 - IU
Lock:Acquired (ffffffffffff) 7 - KEY 14 - RangeS-U

Again, although intent locks are acquired and held without HOLDLOCK, this will not prevent concurrent inserts.

# re: Conditional INSERT/UPDATE Race Condition

Left by Mike Good at 3/30/2009 4:10 PM
Gravatar
I see I'm not the first visitor to think UPDLOCK alone is good enough, no need for the HOLDLOCK.

I think Dan's point is that adding the HOLDLOCK will prevent other INSERT statements (ones not coded per this article), from causing trouble.

If we know that *ALL* INSERTs will be coded per this article, then HOLDLOCK is truly not required. Flow of control gets held up at the SELECT statment, and there is no possibility for a problem insert to sneak in. Right?

My first reaction to this article was to think "Nice, but there's really no need for the HOLDLOCK." I've successfully used this approach with only UPDLOCK for many years. But on further reflection, adding HOLDLOCK seems like cheap insurance, and I don't see any downside beyond having to explain it.

Good articles here. Thanks.

# re: Conditional INSERT/UPDATE Race Condition

Left by Jerry at 8/17/2009 9:06 AM
Gravatar
I have followed the example here with slight alteration to the select/insert sequence. (The reason is I need to use the PK later in the StoredProcedure) However, with HOLDLOCK specified (since I do not need to update), I got the deadlock message on one of the transactions. Here is my example:

SET NOCOUNT, XACT_ABORT ON

DECLARE @FooID INT

BEGIN TRAN

SELECT @FooID = ID FROM dbo.Foo WITH (HOLDLOCK) WHERE Bar=@Bar

IF @FooID IS NULL

BEGIN

INSERT INTO dbo.Foo(ID, Bar) VALUES(@Bar)

SELECT @FooID = @@IDENTITY

END

COMMIT

I do not believe I need the UPLOCK hint as I am not doing insert. But from reading MSDN looks like HOLDLOCK is only a shared lock. In this case, do I need to use TABLELOCKX?



# re: Conditional INSERT/UPDATE Race Condition

Left by Ed at 2/9/2012 4:38 AM
Gravatar
u did not fixed the problem. U just lowered the chance to make a duplicate insert. updlock will not place any locks if data is not there, so if statement will pass by and the thread will be waiting for insert statement. In your exercise you thought you fixed the problem because you just tried with 2 threads, and the if statement did not occure in the exaclty same time.

# re: Conditional INSERT/UPDATE Race Condition

Left by guzmanda at 2/9/2012 9:26 PM
Gravatar
Ed,

The UPDLOCK, HOLDLOCK hint acquires a shared range update lock, which will prevent concurrent inserts within the specified key range. Consider the following:

BEGIN TRAN;
SELECT * FROM dbo.Foo WITH (UPDLOCK,HOLDLOCK) WHERE ID = 1;
EXEC sp_lock @@SPID;
ROLLBACK;

The above script against an empty table shows the following lock which will block inserts into the table until the transaction is committed or rolled back.

RangeS-U (ffffffffffff)

# re: Conditional INSERT/UPDATE Race Condition

Left by Dave at 10/25/2012 1:35 PM
Gravatar
This post, as well as the comments section has been extremely helpful. Thanks for the contribution.
Comments have been closed on this topic.