Deep Dive into Multithreading, Multiprocessing, and Asyncio


Multithreading allows a process to execute multiple threads concurrently, with threads sharing the same memory and resources (see diagrams 2 and 4).

However, Python’s Global Interpreter Lock (GIL) limits multithreading’s effectiveness for CPU-bound tasks.

Python’s Global Interpreter Lock (GIL)

The GIL is a lock that allows only one thread to hold control of the Python interpreter at any time, meaning only one thread can execute Python bytecode at once.

The GIL was introduced to simplify memory management in Python, as many internal operations, such as object creation, are not thread-safe by default. Without the GIL, multiple threads accessing shared resources would require complex locks or synchronisation mechanisms to prevent race conditions and data corruption.

When is GIL a bottleneck?

  • For single-threaded programs, the GIL is irrelevant because the thread has exclusive access to the Python interpreter.
  • For multithreaded I/O-bound programs, the GIL is less problematic, as threads release the GIL while waiting for I/O operations.
  • For multithreaded CPU-bound programs, the GIL becomes a major bottleneck. Multiple threads competing for the GIL must take turns executing Python bytecode.
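A minimal sketch of the CPU-bound case (the helper name count_up is my own, not from any library). On CPython, the threaded version produces the same results as the sequential one but gains no real speedup, because the GIL lets only one thread execute bytecode at a time:

```python
import threading

# CPU-bound work: pure Python bytecode, so the GIL serialises it.
def count_up(n, out, idx):
    total = 0
    for i in range(n):
        total += i
    out[idx] = total

N = 1_000_000

# Sequential: one thread does both chunks back to back.
seq = [0, 0]
count_up(N, seq, 0)
count_up(N, seq, 1)

# "Parallel": two threads, but under the GIL they merely take turns.
par = [0, 0]
threads = [threading.Thread(target=count_up, args=(N, par, i)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seq == par)  # True: identical results, little to no speedup
```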

An interesting case worth noting is time.sleep, which Python effectively treats as an I/O operation. The time.sleep function is not CPU-bound, since it involves no active computation or execution of Python bytecode during the sleep period. Instead, the job of tracking the elapsed time is delegated to the OS. During this time, the thread releases the GIL, allowing other threads to run and use the interpreter.
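A quick sketch to illustrate: because each sleeping thread releases the GIL, four half-second sleeps run concurrently and the total wall time stays near half a second, not two seconds.

```python
import threading
import time

def napper():
    # time.sleep releases the GIL, so other threads can run meanwhile.
    time.sleep(0.5)

start = time.perf_counter()
threads = [threading.Thread(target=napper) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The four sleeps overlap: total is ~0.5s, not ~2s.
print(f"elapsed: {elapsed:.2f}s")
```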

Multiprocessing enables a system to run multiple processes in parallel, each with its own memory, GIL and resources. Within each process, there may be one or more threads (see diagrams 3 and 4).

Multiprocessing bypasses the limitations of the GIL, making it suitable for CPU-bound tasks that require heavy computation.

However, multiprocessing is more resource-intensive due to the separate memory and process overheads.

Unlike threads or processes, asyncio uses a single thread to handle multiple tasks.

When writing asynchronous code with the asyncio library, you use the async/await keywords to manage tasks.

Key concepts

  1. Coroutines: These are functions defined with async def. They’re the core of asyncio and represent tasks that can be paused and resumed later.
  2. Event loop: It manages the execution of tasks.
  3. Tasks: Wrappers around coroutines. When you want a coroutine to actually start running, you turn it into a task, e.g. using asyncio.create_task().
  4. await: Pauses execution of a coroutine, giving control back to the event loop.

How it works

Asyncio runs an event loop that schedules tasks. Tasks voluntarily “pause” themselves when waiting for something, like a network response or a file read. While a task is paused, the event loop switches to another task, ensuring no time is wasted waiting.

This makes asyncio ideal for scenarios involving many small tasks that spend a lot of time waiting, such as handling hundreds of web requests or managing database queries. Since everything runs on a single thread, asyncio avoids the overhead and complexity of thread switching.
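A small sketch of this pattern: ten simulated requests run concurrently on a single thread via asyncio.gather, so the total time is close to one request’s latency rather than the sum of all ten.

```python
import asyncio
import time

async def fetch(i):
    # Simulate an I/O wait, e.g. a network request.
    await asyncio.sleep(0.2)
    return i

async def run_all():
    start = time.perf_counter()
    # Launch all ten "requests" concurrently on one thread.
    results = await asyncio.gather(*(fetch(i) for i in range(10)))
    elapsed = time.perf_counter() - start
    print(f"{len(results)} results in {elapsed:.2f}s")  # ~0.2s, not ~2s
    return results, elapsed

results, elapsed = asyncio.run(run_all())
```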

The key difference between asyncio and multithreading lies in how they handle waiting tasks.

  • Multithreading relies on the OS to switch between threads when one thread is waiting (preemptive context switching).
    When a thread is waiting, the OS switches to another thread automatically.
  • Asyncio uses a single thread and depends on tasks to “cooperate” by pausing when they need to wait (cooperative multitasking).

Two ways to write async code:

Method 1: await coroutine

When you directly await a coroutine, the execution of the current coroutine pauses at the await statement until the awaited coroutine finishes. Tasks are executed sequentially within the current coroutine.

Use this approach when you need the result of the coroutine immediately in order to proceed with the next steps.

Although this might look like synchronous code, it’s not. In synchronous code, the entire program would block during a pause.

With asyncio, only the current coroutine pauses, while the rest of the program can continue running. This makes asyncio non-blocking at the program level.

Example:

The event loop pauses the current coroutine until fetch_data is complete.

import asyncio

async def fetch_data():
    print("Fetching data...")
    await asyncio.sleep(1)  # Simulate a network call
    print("Data fetched")
    return "data"

async def main():
    result = await fetch_data()  # Current coroutine pauses here
    print(f"Result: {result}")

asyncio.run(main())

Method 2: asyncio.create_task(coroutine)

The coroutine is scheduled to run concurrently in the background. Unlike await, the current coroutine continues executing immediately without waiting for the scheduled task to finish.

The scheduled coroutine starts running as soon as the event loop gets a chance, without needing an explicit await.

No new threads are created; instead, the coroutine runs within the same thread as the event loop, which manages when each task gets execution time.

This approach enables concurrency within the program, allowing multiple tasks to overlap their execution efficiently. You’ll later need to await the task to get its result and ensure it’s done.

Use this approach when you want to run tasks concurrently and don’t need the results immediately.

Example:

When the line asyncio.create_task() is reached, the coroutine fetch_data() is scheduled to start running as soon as the event loop is available. This can happen even before you explicitly await the task. In contrast, with the first method, the coroutine only starts executing when the await statement is reached.

Overall, this makes the program more efficient by overlapping the execution of multiple tasks.

import asyncio

async def fetch_data():
    # Simulate a network call
    await asyncio.sleep(1)
    return "data"

async def main():
    # Schedule fetch_data
    task = asyncio.create_task(fetch_data())
    # Simulate doing other work
    await asyncio.sleep(5)
    # Now, await the task to get its result
    result = await task
    print(result)

asyncio.run(main())

Other details

  • You can mix synchronous and asynchronous code.
    Since synchronous code is blocking, it can be offloaded to a separate thread using asyncio.to_thread(). This makes your program effectively multithreaded.
    In the example below, the asyncio event loop runs on the main thread, while a separate background thread executes sync_task.
import asyncio
import time

def sync_task():
    time.sleep(2)
    return "Completed"

async def main():
    result = await asyncio.to_thread(sync_task)
    print(result)

asyncio.run(main())

  • You should offload CPU-bound tasks that are computationally intensive to a separate process.

This flowchart is a good way to decide when to use what.

Flowchart (drawn by me), referencing this stackoverflow discussion
  1. Multiprocessing
    – Best for CPU-bound tasks that are computationally intensive.
    – When you want to bypass the GIL: each process has its own Python interpreter, allowing for true parallelism.
  2. Multithreading
    – Best for fast I/O-bound tasks, as the frequency of context switching is reduced and the Python interpreter sticks with a single thread for longer.
    – Not ideal for CPU-bound tasks due to the GIL.
  3. Asyncio
    – Ideal for slow I/O-bound tasks such as long network requests or database queries, since it efficiently handles waiting, making it scalable.
    – Not suitable for CPU-bound tasks without offloading work to other processes.

That’s it, folks. There’s a lot more to this topic, but I hope I’ve introduced you to the various concepts and when to use each method.

Thanks for reading! I write regularly on Python, software development and the projects I build, so give me a follow so you don’t miss out. See you in the next article 🙂
