Hashmaps are THE MOST important data structure for any code interview. Hands down.
They come up in most if not all coding interviews, because they are essential to any performant software. You might know them by other higher level siblings names like dictionaries, maps or vectors.
To understand hashmaps first we have to understand -
Why are arrays cool?
Arrays are a constant N long block in memory. This is why array is instantly accessible (random access) Because its 4 bytes away from memory address of array.
But... arrays have two big flaws. As you may conclude from the previous paragraph, they have to have immutable size due to them having to be a block in memory.
Yet another flaw is that we can access them with only numbers.
Hashmap solves all of this in theory.
By simply hashing your key into a number, with a function.
eg. A is 65 in ascii, Z is 90.
By creating a simple function
create_hash(letter) uppercase letter substract 65 from ascii value of letter return the result of substraction
We get A or a to equal 0 etc.. all the way to Z being 25.
Hashmap a lower level implementation that is often used by the higher level constructs like dictionary- with much more complicated hashing functions that one above. Allowing the key to be arbitrary data.
Of course this solution isn't perfect.
Oh no a collision
What if we want to store two of values that share the same result from the create_hash function?
This is what we conventionally call a collision.
We can handle them in several ways. By expanding the hashmap size, until a collision is impossible. Eg. An array with a byte for every word in the English language.
That solution assumes we know all the data in advance, and will most likely be very memory intensive.
This is why the generally agreed on solution to this problem is storing a pointer to the head of a linked list.
Allowing us to have potentially theoretically infinite space, with best case O(1) - when there is no collisions - and O(n) - assuming everything is a collision - but that would be a very badly designed hash function.
How to search in a hashmap?
Lets say we want to find the value of the key "apple" inside of our hashtable.
Remember hashtable is just a array of pointers. So first we have to get the index of key "apple".
Thats simple, we just run apple by the same hash function we would run it if we were adding it.
Lets say we get the value 5.
That means in the hashtable array, index 5 would contain the value of apple. But remember the hashtable stores just the pointer to the head of a linked list - to avoid collisions. This means we have to traverse this linked list until we find node with the key apple. It might be the first one, or the last one, we don't know.
traverse_list(pointer, key) Go to memory at pointer Is the key of this node "apple" ? Yes: return value of node Does this node have a pointer to the next one? No: return false; // Theres no node with key "apple" Yes: traverse_list(pointer to next one, key)
Simply to traverse the list we check if this node has the key of 'apple'
If so we found it.
If not we have to check if its the end of the list. This would mean node with key 'apple' doesn't exist. And if it has a pointer then we call this function again, recursively with the pointer of the next node.
This algorithm will keep going until there is no more nodes in the linked list. By calling itself over and over, recursively.
How to add values to a hashmap?
The algorithm for adding a value to the list is as complicated as you make it.
You can have all kinds of fancy rules what order the linked list should be. We will go over the easiest case, adding it to the end of the list.
Algorithm is the same as one above, with one difference. When we find a node with no pointer to the next one, we make it point to a new one we create with key of 'apple'
This is a part of a expanding series on data structures. Community at Crunchskills is here to answer any questions you might have. Don't be afraid to ask in the comments below.
Want interview prep in your inbox every week?