Keys in data modeling

In data modeling, we use keys for data retrieval and easy navigation. In this post we are going to discuss about the different keys used in data modeling such as candidate keys, composite keys, primary keys, alternate keys, surrogate keys and foreign keys and their important.


Candidate key


Candidate key is one or more data elements that uniquely identify an entity instance. For example, IMEI(International Mobile Station Equipment Identity) used to uniquely identify mobile phones. Sometimes it takes more than one data field to uniquely identify an entity instance. For example, Service Period and Date of Purchase are necessary to identify a free service. When more than one data fields makes up a key, it is called as 'composite key'. Therefore, Service Period and Date of Purchase together are a composite candidate key for a Service. 

Candidate key must be unique, for example, the IMEI code cannot be duplicated, it must be unique to each mobile. Also, candidate key cannot be changed in future. It should contain only those data fields that are uniquely identify an entity instance. Every entity must have at least one candidate key but at the same time can have several.


Determining the candidate key:


The candidate key can be identified by using certain criteria,
  • It must contain unique values.
  • It must not contain null values.
  • It contains the minimum number of fields to ensure uniqueness.

Candidate key
CANDIDATE KEY

For example, the entity Person, contains the data fields name, birth date, gender and social security number.  Social security number satisfies all the criteria to become a candidate key. Person Name may contain duplicate values. For example, two person may have same name. Similarly, data field birth date may contain duplicate value. In this case, the data fields name, birth date and gender combined to form a candidate key, which is used to uniquely identify a instance of an entity. These combination of data fields are called as composite key.


Primary key:


Once the candidate keys have been identified, we can now select one to be our primary key. Even though an entity may contain more than one candidate key, we can only select one candidate key to be primary key for an entity. A primary key is a candidate key that is most appropriate to be the main reference key for an entity. As with any candidate key, the primary key must contain unique values, cannot be null and uniquely identify each record in the table. In selecting one candidate key over another as primary key we have to consider the one with the fewest data fields. Also, we need to consider the security, we cannot reveal the sensitive information to many entities in database since, primary key used in many other entities as foreign key(explained below).


Alternate key:



An alternate key is a candidate key that although unique, was not chosen as the primary key, but still can be used to find specific entity instance.


Surrogate key:


A Surrogate key is a primary key that substitutes for a natural key. For example, consider the following table, Person_id is a surrogate key, which has no embedded intelligence and is used by information technology for integration or performance reason,but in real world, there may no such data field used by the business.

Primary key, Alternate key & Surrogate key
PRIMARY, ALTERNATE & SURROGATE KEY

In Person entity, we have two candidate keys, we have to choose a primary key from candidate keys. Primary key cannot be null, must be unique and short in length. We cannot choose the three data fields as primary key since it will not satisfy the required criteria. So we are going for next candidate key Social Security Number, but as per our criteria, we cannot use the sensitive information in many entities in our database as foreign key. So we need to look for an alternative surrogate key. Person-id data field has no embedded intelligence and it is used only by the information technology but there is no such thing in business. It is used only for integeration and performance.


Foreign key:


A foreign key is a data field that is used to connect the another entity. For example, consider the following table, in which the primary key Student_id in entity Student is used as a foreign key in the entity teaching.

The entities Country and Person having the following relationship.Each country contain many person and Each person must belongs to one country.

The entity country contains the foreign key person_id, which is used to link the entity person.

Foreign key
FOREIGN KEY

In next post, let us see about the Conceptual data model. Share your thoughts about keys in comment section.

2 comments:

  1. I have bookmarked your site since this site contains significant data in it. You rock for keeping incredible stuff. I am a lot of appreciative of this site.
    best data science courses

    ReplyDelete
  2. I have bookmarked your site since this site contains significant data in it. You rock for keeping incredible stuff. I am a lot of appreciative of this site.
    best data science courses

    ReplyDelete