Hash tables are a data structure that stores a key value pair similar to a hash map. Hash tables are similar to hash maps but there are some differences:
- Hash tables are synchronized which is preferable in a multithread application.
- Hash table does not allow null values or keys unlike hash map which allows 1 null key and multiple null values
- It is possible to have a ordered iteration for hash map since a sub class of hash map is LinkedHashMap
Hash table hashes the key which gives us the hash code which is used as the index to store the value. Hashing is basically taking an input then performing some formula which chops/mixes it up to get a new string. A simple hash would be using the modding the number like 28 % 13. The bucket is an array of linked list where the linked list located in the array based on a hash code. This combination is used to handle hash code collisions which is when the hash method produces the same hash code for different keys.
Hash Table Complexity
Best Case/Worst Case
- Space: O(n)/O(n)
- Search: O(1)/O(n)
- Insert: O(1)/O(n)
- Delete: O(1)/O(n)
The best and worst case for space is n which is the amount of data being stored. The best case for search, insert and delete is O(1) which means it instantly finds the location of the value it is looking for but the worst case is O(n). The worst case is if the hashing method results in the same string over and over again for all values so it has to look through everything in the location.
Depending on the hash method it is possible to have the same hash code as the output multiple time like with modding. This is why it is important to have a good hash method to minimize the chances of that happening. But in the case that does happen it is important to have a way to handle it. There are 2 different ways to hand this:
- Each bucket would contain a linked list of elements so any elements with the same hash code would be put in the linked list. This is where the worst case of O(n) comes from where everything is put in the same bucket and linked list.
- Increase the bucket size so the values in the linked list is redistributed. In this case the has method would have to be updated to take into account the new bucket size.
In many situations has tables are faster and more efficient than trees and other tables which is why they are widely used. However, if you are developing a multi thread application you may want to use a hash map instead. Hash tables will inevitably encounter hash collisions when dealing with a large amount of data. A large amount of collisions will result in inefficiency so be prepared to handle that or pick a different data structure. Hash tables also do not allow null values or keys.