How to do parallel programming in Python?
Unlocking Concurrency: A Guide to Parallel Programming in Python

Explore the fundamentals of parallel programming in Python, understanding the Global Interpreter Lock (GIL) and leveraging modules like multiprocessing and threading for efficient concurrent execution.
Python, often celebrated for its simplicity and readability, presents unique challenges and opportunities when it comes to parallel programming. While the Global Interpreter Lock (GIL) can limit true parallel execution of CPU-bound tasks within a single process, Python offers robust modules like multiprocessing and threading to achieve concurrency and parallelism. This article will guide you through the core concepts, practical implementations, and best practices for writing efficient parallel Python code.
Understanding Concurrency vs. Parallelism and the GIL
Before diving into code, it's crucial to distinguish between concurrency and parallelism. Concurrency is about dealing with many things at once (e.g., interleaving tasks on a single core), while parallelism is about doing many things at once (e.g., using multiple cores simultaneously). The GIL in CPython, the reference implementation, is a mutex that protects access to Python objects, preventing multiple native threads from executing Python bytecode at the same time. This means that even on multi-core systems, only one thread in a Python process executes Python bytecode at any moment, so threads give no speed-up for CPU-bound work. For I/O-bound tasks, however, a thread releases the GIL while it waits for an external resource, allowing other threads to run.
flowchart TD
    A[Python Program Start]
    B{Task Type?}
    C[CPU-Bound Task]
    D[I/O-Bound Task]
    E[GIL Acquired]
    F[GIL Released]
    G[Single Thread Execution]
    H["Multiple Threads (Concurrent)"]
    I["Multiprocessing (Parallel)"]
    J[Program End]
    A --> B
    B -->|CPU-Bound| C
    B -->|I/O-Bound| D
    C --> E
    E --> G
    D --> F
    F --> H
    G --> J
    H --> J
    C --> I
    I --> J
Decision flow for Python concurrency and parallelism based on task type.
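To make the CPU-bound branch of the diagram concrete, here is a minimal sketch that times the same pure-Python loop run sequentially and in threads. The helper names (count_down, time_sequential, time_threaded) and the loop size are illustrative assumptions. On a standard GIL-enabled CPython build the two timings are usually close, but the exact numbers depend on the machine and interpreter, so no expected output is shown.

```python
import threading
import time


def count_down(n):
    """Pure-Python busy loop: CPU-bound, so the running thread needs the GIL."""
    while n > 0:
        n -= 1


def time_sequential(n, repeats):
    """Run count_down `repeats` times back to back and return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def time_threaded(n, repeats):
    """Run count_down in `repeats` threads and return elapsed seconds."""
    threads = [
        threading.Thread(target=count_down, args=(n,)) for _ in range(repeats)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000  # illustrative workload size
    print(f"sequential: {time_sequential(n, 2):.2f}s")
    print(f"threaded:   {time_threaded(n, 2):.2f}s")
```

If the threaded run is not noticeably faster on your machine, that is the GIL serializing the bytecode, and the work is a candidate for multiprocessing instead.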
Achieving Parallelism with multiprocessing
For CPU-bound tasks, the multiprocessing module is your go-to solution. It bypasses the GIL by spawning new processes, each with its own Python interpreter and memory space. This allows true parallel execution across multiple CPU cores. The module provides a Process class for creating individual processes and a Pool class for managing a pool of worker processes, which is ideal for applying a function to a large dataset in parallel.
import multiprocessing
import os


def square(number):
    print(f"Process ID: {os.getpid()} - Squaring {number}")
    return number * number


if __name__ == "__main__":
    numbers = [1, 2, 3, 4, 5]

    # Create a Pool of worker processes
    with multiprocessing.Pool(processes=4) as pool:
        # Map the square function to the numbers list
        results = pool.map(square, numbers)

    print(f"\nOriginal numbers: {numbers}")
    print(f"Squared results: {results}")
Example of using multiprocessing.Pool for parallel execution of a CPU-bound task.
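The Process class mentioned above can also be used directly when you want explicit control over each worker. The following is a minimal sketch, not a recommended pattern for large workloads: the functions cube and cube_all are hypothetical helpers, and a multiprocessing.Queue is assumed as the channel for sending results back, since child processes do not share memory with the parent.

```python
import multiprocessing


def cube(number, results):
    """Compute number ** 3 and send (input, output) back through the queue."""
    results.put((number, number ** 3))


def cube_all(numbers):
    """Start one Process per input and collect the results in input order."""
    results = multiprocessing.Queue()
    processes = [
        multiprocessing.Process(target=cube, args=(n, results)) for n in numbers
    ]
    for process in processes:
        process.start()
    # Read one result per process before joining, so no child is left
    # blocked while trying to flush its data into the queue.
    collected = dict(results.get() for _ in processes)
    for process in processes:
        process.join()
    return [collected[n] for n in numbers]


if __name__ == "__main__":
    print(cube_all([1, 2, 3]))
```

One process per item is wasteful for long input lists; in that situation Pool, as in the example above, reuses a fixed number of workers.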
Always place multiprocessing code within an if __name__ == "__main__": block. This is crucial on Windows and some Unix systems, where each child process imports the main script on startup; without the guard, that import would re-run the process-creating code and spawn processes recursively.
Achieving Concurrency with threading
The threading module allows you to run multiple functions concurrently within the same process. Due to the GIL, this is most effective for I/O-bound tasks, such as network requests, file operations, or database queries. While one thread is waiting for an I/O operation to complete, the GIL is released, allowing another thread to execute Python bytecode. This can significantly improve the responsiveness and throughput of applications that spend a lot of time waiting.
import threading
import time


def fetch_url(url):
    print(f"Starting to fetch {url}...")
    time.sleep(2)  # Simulate network request
    print(f"Finished fetching {url}")


urls = [
    "http://example.com/page1",
    "http://example.com/page2",
    "http://example.com/page3",
]

threads = []
for url in urls:
    thread = threading.Thread(target=fetch_url, args=(url,))
    threads.append(thread)
    thread.start()

for thread in threads:
    thread.join()

print("All URLs fetched.")
Example of using threading for concurrent execution of I/O-bound tasks.
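To see how much waiting time threads can overlap, the sketch below times the same simulated requests run one after another and then in threads. The helpers fake_request, run_sequential, and run_threaded are illustrative assumptions, and time.sleep stands in for real network latency, so real-world gains will vary.

```python
import threading
import time


def fake_request(delay):
    """Stand-in for a network call; time.sleep releases the GIL while waiting."""
    time.sleep(delay)


def run_sequential(delays):
    """Perform the simulated requests one by one; return elapsed seconds."""
    start = time.perf_counter()
    for delay in delays:
        fake_request(delay)
    return time.perf_counter() - start


def run_threaded(delays):
    """Perform the simulated requests in separate threads; return elapsed seconds."""
    threads = [
        threading.Thread(target=fake_request, args=(delay,)) for delay in delays
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    delays = [0.5, 0.5, 0.5]  # illustrative wait times in seconds
    print(f"sequential: {run_sequential(delays):.2f}s")
    print(f"threaded:   {run_threaded(delays):.2f}s")
```

The sequential run takes roughly the sum of the delays, while the threaded run takes roughly the longest single delay, because the waits overlap.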
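Threads in the same process share memory, so unsynchronized access to shared data can cause race conditions. The threading module provides various synchronization primitives, such as Lock, RLock, Semaphore, Event, and Condition, to manage shared resources safely.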
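As one illustration of these primitives, the sketch below guards a shared counter with a threading.Lock. The Counter class and the hammer and run helpers are hypothetical names chosen for this example; the point is only that the read-modify-write inside increment happens under the lock.

```python
import threading


class Counter:
    """Hypothetical shared counter whose updates are guarded by a Lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # `self.value += 1` is a read-modify-write, not an atomic step,
        # so only one thread at a time is allowed to perform it.
        with self._lock:
            self.value += 1


def hammer(counter, times):
    """Increment the shared counter `times` times."""
    for _ in range(times):
        counter.increment()


def run(workers=4, times=10_000):
    """Increment one Counter from several threads and return the final value."""
    counter = Counter()
    threads = [
        threading.Thread(target=hammer, args=(counter, times))
        for _ in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


if __name__ == "__main__":
    print(run())  # 4 workers x 10,000 increments = 40000
```

Using the lock as a context manager ensures it is released even if the guarded code raises an exception.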