Wednesday, January 7, 2009

How to decide Primary Key column data type in MySQL? Expectation with the Primary Key column

Decide the Primary Key column is one of the important tasks. Typically, the Primary Key is must mitigate the below properties.


 


(1) Uniquely identify each row in a table


(2) Can be referenced by another column/row (foreign key)


(3) provide a "natural" sequencing of the data (alphabetical, numeric, and chronological)


(4) Partition the data horizontally for performance


 


If your interest is either "unique identifier" or "natural sequencing", you will be using whatever data type fits for the rows natural identifier. So that pretty much answers the question, and should be your first choice.


 


Remember that a primary key must be immutable -- never changes -- so using a natural key may not be possible. In that case, a surrogate key becomes a necessity.


 


A surrogate key will probably break your "natural sequencing", since the data will be indexed on the surrogate key value, not on the natural values. This doesn't mean to avoid it, but only to recognize that condition and address it in your design (an index, for example, restores the sequencing but adds overhead to update/delete/insert operations).


 


Also, if you are looking at this as a performance problem, you may want to substitute an integer data type as a "surrogate key" for the natural primary key.


 


If you use a surrogate key, use an integer that is sized to the native size of the platform (32-bit, 64-bit, 128-bit integer) up to the size needed to support uniqueness.


 


I would avoid integers which increment monotonically (by 1) -- this presents a "bottleneck" in the processing, making it complicated to run multiple, parallel, concurrent insertions. If you're running an OLTP system, that bottleneck will limit your performance and scalability.


 


So ultimately to decide the primary key data type we need to project the data growth, number of insertion, number of selection and performance issues.


 


For the normal table you can use primary key datatype as BIG INT with AUTO_INCREMENT option.

3 comments:

  1. Hi,

    Nice article. and great explanation.
    I am DBA with 12 years of experience, will be the concept is same in SQL Also?

    ReplyDelete
  2. Yes most probably same :) still you need to consider more points while deciding data type

    ReplyDelete
  3. Thanks, that's help me very much in my work

    ReplyDelete