slouch's blog

Today I Remember - Python Concurrency

Python is a powerful and versatile language, but there is one thing it is not so great at: multithreading. If you have ever written a Python script that spawns multiple threads to do work in parallel, you may have run into deadlocks, race conditions, or other concurrency bugs.

But fear not - there is a solution! The concurrent.futures module, part of the standard library since Python 3.2, makes multithreading much more manageable, and can help you avoid many of the common pitfalls that come with concurrent programming.

At its core, concurrent.futures is a high-level interface for managing pools of threads or processes in Python. It provides a simple and intuitive API for executing functions asynchronously and collecting their results.

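One of the key benefits of concurrent.futures is that it abstracts away many of the low-level details of thread and process management: starting and stopping workers, queueing tasks, and handing results (or exceptions) back to the caller, including the inter-process communication needed when the workers are processes. It does not protect shared state for you, so tasks that modify the same objects still need their own locking, but for independent tasks it makes it much easier to write concurrent code that is correct, efficient, and maintainable.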
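To give a rough idea of how little bookkeeping is left, here is a minimal sketch using the executor's map() method. The function word_length and the words list are made up for illustration; they stand in for whatever independent work you actually have.

```python
from concurrent.futures import ThreadPoolExecutor

def word_length(word):
    # Hypothetical stand-in for real work (an I/O call, for instance).
    return len(word)

words = ["thread", "pool", "executor"]

# The executor starts the worker threads, queues one call per item,
# and shuts the pool down when the with-block exits.
with ThreadPoolExecutor(max_workers=2) as executor:
    # map() yields results in the same order as the inputs.
    lengths = list(executor.map(word_length, words))

print(lengths)  # [6, 4, 8]
```

There is no explicit thread creation, joining, or queue handling anywhere in that snippet; the executor takes care of it.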

To get started with concurrent.futures, import the module and create an instance of the ThreadPoolExecutor or ProcessPoolExecutor class, depending on whether you want to use threads or processes. You then submit tasks to the executor with its submit() method, which returns a Future object right away, and collect each result by calling result() on that Future.
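Stripped down to a single task, the submit-then-result pattern might look like the sketch below. The square function is a hypothetical placeholder, not anything from the module itself.

```python
from concurrent.futures import ThreadPoolExecutor

def square(n):
    # Hypothetical task used only for illustration.
    return n * n

with ThreadPoolExecutor(max_workers=2) as executor:
    future = executor.submit(square, 7)  # returns a Future immediately
    value = future.result()              # blocks until the task finishes

print(value)  # 49
```

Swapping ThreadPoolExecutor for ProcessPoolExecutor keeps the same submit() and result() calls, though with processes the function and its arguments must be picklable, and on platforms that spawn new interpreters the pool should be created under an if __name__ == "__main__": guard.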

For example, let us say you have a list of URLs that you want to download in parallel. You could use concurrent.futures to spawn a pool of worker threads to download the URLs concurrently, like so:

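import requests
from concurrent.futures import ThreadPoolExecutor

urls = [
    'https://www.example.com',
    'https://www.google.com',
    'https://www.github.com',
]

def download_url(url):
    # A timeout keeps one unresponsive server from tying up a worker forever.
    response = requests.get(url, timeout=10)
    return response.content

with ThreadPoolExecutor(max_workers=3) as executor:
    # submit() schedules each download and returns a Future immediately.
    futures = [executor.submit(download_url, url) for url in urls]
    # result() blocks until that download finishes and re-raises any exception it hit.
    results = [future.result() for future in futures]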

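This code creates a pool of three worker threads using a ThreadPoolExecutor, and submits one download task per URL using the submit() method. Calling result() on each future then blocks until that download has finished, so results ends up in the same order as urls. If a download raised an exception, result() raises it again in the main thread.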
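If you would rather handle each download as soon as it finishes, and keep going when one of them fails, as_completed() from the same module is one option. The sketch below is only an illustration: fake_download and the example URLs are hypothetical stand-ins so that it runs without a network connection.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fake_download(url):
    # Hypothetical stand-in for download_url() so the sketch runs offline.
    if "bad" in url:
        raise ValueError(f"cannot fetch {url}")
    return f"contents of {url}".encode()

urls = ["https://a.example", "https://bad.example", "https://c.example"]
results = {}
errors = {}

with ThreadPoolExecutor(max_workers=3) as executor:
    # Map each future back to its URL so we know which task it belongs to.
    future_to_url = {executor.submit(fake_download, url): url for url in urls}
    # as_completed() yields futures in the order they finish, not submission order.
    for future in as_completed(future_to_url):
        url = future_to_url[future]
        try:
            results[url] = future.result()  # re-raises the task's exception
        except ValueError as exc:
            errors[url] = str(exc)

print(sorted(results))
print(errors)
```

Keying the dictionary by future is a common way to recover which input a finished task belongs to, since completion order is not predictable.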

By using concurrent.futures, you can write concurrent Python code that is more readable, more maintainable, and less prone to bugs than code that manages threads or processes by hand with the threading or multiprocessing modules. If you are interested in learning more, check out the official documentation [1] or this excellent tutorial [2].

#today-i-remember