Monday, April 14, 2008

GUID vs. int (Uniqueidentifier vs. identity

I got interested in GUID's when I saw it was used in the ASP.NET membership database schema. At the first glance, 0D40FC92-4F1D-42D1-9A9B-860526C21FF0 didn't seem very nice, but eventually I realized how flexible and effective the GUID datatype is. So let's have a look at why you should use it instead of identities (int)
Let's start by identifying the problems with identity columns (int):
- By default, inserting values into identity columns are not allowed.
- If you try to copy data from one database to another, the identity value is most probably a duplicate as it's already present in the target database.
- No way to know the identity value beforehand until the row is actually inserted into the table.
Using GUID's solves all this:
- In 99,99% of the cases a GUID is unique, due to this you won't see the same value in the target database.
- Since a GUID can be generated in the application layer, like in .NET, it can be generated at any time, even before it's inserted into the database.
So what are the disadvantages of GUIDs'?
- Less readable than int and doesn't look nice in URL's. As a solution to the latter you can use url-rewriting.
- Requires more storage as they are 16 bytes in size instead of 4 bytes for int. But for my own experience I haven't noticed any performance difference when switching to GUID, they perform really well.
To generate a new GUID using MS-SQL Server you use the newid() method, and in .NET you use the GUID.NewGUID().
Since there is no similiar method to scope_identity() or @@identity when using GUID's, then the best way is to generate the value in a stored procedure or in your application code, instead of having newid() as the default value for your GUID-column.