Hash Map in Python
Categories:
Understanding Python Hash Maps (Dictionaries)

Explore the fundamental concepts, implementation, and practical applications of hash maps (dictionaries) in Python, a crucial data structure for efficient data storage and retrieval.
In Python, the dict
type is an implementation of a hash map (also known as a hash table or associative array). It's one of the most versatile and frequently used data structures, offering highly efficient ways to store and retrieve data using unique keys. Understanding how hash maps work under the hood is crucial for writing optimized and robust Python code.
What is a Hash Map?
A hash map is a data structure that implements an associative array abstract data type, a structure that can map keys to values. A hash map uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found. Ideally, the hash function will assign each key to a unique bucket, but most hash map designs accommodate hash collisions (when the hash function generates the same index for more than one key).
flowchart TD A[Key] --> B{Hash Function} B --> C[Hash Value (Index)] C --> D[Array/Buckets] D --> E[Value]
Basic concept of how a hash map works
Python's dict
as a Hash Map
Python's built-in dict
type is a highly optimized hash map implementation. It stores key-value pairs, where keys must be hashable (immutable types like strings, numbers, tuples) and values can be any Python object. The efficiency of dict
operations (average O(1) for insertion, deletion, and lookup) makes it indispensable for many programming tasks.
# Creating a dictionary (hash map)
my_dict = {
"name": "Alice",
"age": 30,
"city": "New York"
}
# Accessing values
print(my_dict["name"]) # Output: Alice
# Adding a new key-value pair
my_dict["occupation"] = "Engineer"
print(my_dict) # Output: {'name': 'Alice', 'age': 30, 'city': 'New York', 'occupation': 'Engineer'}
# Updating a value
my_dict["age"] = 31
print(my_dict) # Output: {'name': 'Alice', 'age': 31, 'city': 'New York', 'occupation': 'Engineer'}
# Deleting a key-value pair
del my_dict["city"]
print(my_dict) # Output: {'name': 'Alice', 'age': 31, 'occupation': 'Engineer'}
Basic operations with Python dictionaries
__hash__
method and an __eq__
method, and their hash value must remain constant throughout their lifetime. Immutable types like strings, numbers, and tuples are hashable. Mutable types like lists and dictionaries are not.Hash Collisions and Resolution
Hash collisions occur when two different keys produce the same hash value. Python's dict
handles collisions using a technique called open addressing, specifically a variant of probe sequencing. When a collision occurs, the dictionary probes for the next available slot in the internal array. This process ensures that all key-value pairs can be stored and retrieved correctly, even if their hash values are identical.
flowchart LR A[Key1] --> B{Hash Function} C[Key2] --> B B --> D[Hash Value X] D --> E{Bucket X Occupied?} E -- Yes --> F[Probe Next Slot] F --> G[Store Key-Value Pair] E -- No --> G
Simplified collision resolution using probing
Performance Characteristics
Python dictionaries offer excellent average-case performance for common operations:
- Insertion (adding a key-value pair): O(1) on average.
- Deletion (removing a key-value pair): O(1) on average.
- Lookup (accessing a value by key): O(1) on average.
In the worst-case scenario (e.g., many hash collisions leading to a long chain of probes), these operations can degrade to O(n), where n is the number of items in the dictionary. However, Python's dict
implementation is highly optimized to minimize such occurrences through good hash functions and resizing strategies.
TypeError: unhashable type
or unexpected behavior if the object's hash changes after being inserted. Always use immutable types for dictionary keys.