Handling Collisions in Hash Tables

C - Hash tables - Separate Chaining

Jul 06, 2023

Intro.

Separate chaining is a common technique used to handle collisions in hash tables. When two or more keys hash to the same index in the hash table, separate chaining involves creating a linked list at that index and inserting the colliding keys into the list.

In a separate chaining hash table, each slot in the table is a pointer to the head of a linked list. When a new key-value pair is inserted into the table, its key is hashed to an index in the table, and the pair is inserted at the head of the linked list at that index. If a collision occurs, the new pair is simply appended to the linked list at that index.

When searching for a key in a separate chaining hash table, the key is hashed to an index, and then a linear search is performed through the linked list at that index to find the matching key. If the key is not found in the linked list at that index, it does not exist in the hash table.

Separate chaining has several advantages. It is simple to implement and allows the hash table to accommodate an arbitrary number of key-value pairs. It also has good performance characteristics, with an expected constant-time complexity for insertion, deletion, and lookup operations, assuming that the hash function distributes the keys uniformly across the table.

However, separate chaining also has some drawbacks. It requires extra memory to store the linked lists, and searching through the linked lists can be slower than searching through an array. Additionally, if the hash function produces a poor distribution of keys, the linked lists can become very long, degrading the performance of the hash table.

Insert Item Node to a Linked List:

Assume struct with which we will allocate hash table items: (item carries key and value “see post in Reference section”)

typedef struct LinkedList

{

H_item *item;

struct LinkedList *next;

} LinkedList;

Overflow Bucket:

An overflow bucket is a data structure used in hash tables to handle collisions when a hash function maps two or more keys to the same index.

Overflow buckets can be implemented as a linked list or other data structure that allows for efficient insertion and retrieval of key-value pairs.

An overflow bucket solves this problem by allowing a single index in the hash table to hold multiple key-value pairs. When a collision occurs, the new key-value pair is added to the overflow bucket at the index, rather than overwriting the existing pair.

All what we need is addition of it to update Hash table struct like :

// Defines the HashTable.
struct HashTable
{
    // Contains an array of pointers to items.
    Ht_item** items;
    LinkedList** overflow_buckets;
    int size;
    int count;
};

Create buckets and initialize each with NULL:

Leverage hash table to contain these buckets:

    table->overflow_buckets = create_bucket(table);

Now rewrite insert to table function using separate chaining that avoids collisions in hash table :

The function first computes the hash code for the key using the hash function. It then locates the linked list at the hashed index in the main hash table. If the key already exists in the linked list, the function updates its value. If the key does not exist in the linked list, the function creates a new key-value pair using the create_item function, creates a new node in the linked list using the create_node function, and adds the new node to the head of the linked list.

Rights reserved - Handling collisions in Hash table using linked lists

You may want to check :

O(1) Hash Tables in C

Shaza Ali

July 6, 2023

Read full story

Shaza’s Substack

O(1) Hash Tables in C

Discussion about this post