Choosing between free threading and async in Python
At this year’s EuroPython, Optiver Senior Software Engineer and Team Lead Samet Yaslan delivered a timely talk for developers working on performance-critical systems: “Choosing between free threading and async.”
Samet’s session was sparked by a significant change to CPython. Beginning with version 3.13, CPython offers an optional build known as free threading, in which the Global Interpreter Lock (GIL) is removed. The question is: with the GIL gone, do we still need async in Python?
Watch the talk or read the write-up to see how Samet breaks down the trade-offs and what they could mean for your next Python project.
In this blog, I’ll guide you through my talk’s key points and explain how to choose the best concurrency model—synchronous, async, or multi-threading—for your Python projects. We’ll discuss the strengths and weaknesses of each approach and when to choose one over the other—whether you’re handling CPU-bound tasks with threads or I/O-bound workloads with async.
With the help of a simple kitchen analogy, you’ll have a clear understanding of which model to use for your specific case, empowering you to make more efficient decisions.
First, what is the GIL and why do we have it?
The Global Interpreter Lock (GIL) exists mainly to simplify memory management in CPython by preventing race conditions when updating reference counts, which are used to track object lifetimes. This makes the interpreter easier to implement and often more efficient for single-threaded applications. When threads were first introduced in Python, most systems had only a single CPU, so this limitation wasn’t a major concern. The GIL was a practical trade-off to reduce complexity and ensure stability at a time when true parallel execution wasn’t widely needed.
Over time, however, hardware capabilities have evolved dramatically. Modern machines often come equipped with many CPU cores and leveraging them effectively has become crucial for performance-intensive applications. In this environment, the value of true parallelism—and thus of multi-threading without the constraints of the GIL—has grown significantly.
In version 3.13, CPython introduces an option where the GIL can be disabled. This allows threads to truly run at the same time, fully utilising multiple cores and processors. It dramatically improves multi-threaded performance of the Python interpreter.
To ease the transition, free threading ships as a separate, optional build of version 3.13:
- 3.13t: a build without the GIL, allowing free threading.
- Developers can choose to use this build if they want; the standard build keeps the GIL.
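As a quick sanity check, you can ask the interpreter itself which build you are running. This is a small sketch: `Py_GIL_DISABLED` is the build-configuration flag CPython sets for the free-threaded build, and `sys._is_gil_enabled()` (available from 3.13) reports whether the GIL is currently active, so the `getattr` guard keeps the snippet runnable on older versions.

```python
import sys
import sysconfig

# Py_GIL_DISABLED is 1 when the interpreter was compiled for free
# threading (the "t" build); it is 0 or absent otherwise.
free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

# On 3.13+, sys._is_gil_enabled() reports whether the GIL is active
# right now (a free-threaded build can still re-enable it at runtime).
gil_check = getattr(sys, "_is_gil_enabled", None)
gil_enabled = gil_check() if gil_check is not None else True

print(f"free-threaded build: {free_threaded_build}, GIL enabled: {gil_enabled}")
```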
But how does this affect your choice between synchronous, async, and multi-threading models?
The kitchen analogy: three approaches, three scenarios
Imagine a restaurant kitchen where the goal is to prepare a simple meal—a steak and a salad. We’re going to prepare this meal in three different ways.
1. Kitchen one: A single cook

There’s a single cook responsible for all prep from start to finish: seasoning, grilling, resting and plating the steak, then chopping, mixing and plating the salad. A simple and efficient approach if you only have a few customers.
2. Kitchen two: A multitasking cook

There’s still a lone cook, but one who multitasks due to a high number of customers. The key is managing multiple tasks at the same time by breaking them down into smaller steps and switching between them when there’s idle time—only possible when there’s something else (like the grill) doing some of the work.
3. Kitchen three: Multiple cooks, multiple grills

There are multiple cooks working together, yet independently, executing tasks in true parallel fashion, with multiple steaks and salads being prepared all at the same time. But here kitchen design is a factor. With only one grill, the kitchen’s efficiency is limited. With multiple grills, multiple cooks can all grill steaks simultaneously, and the kitchen gets faster.
Now, imagine this restaurant is your Python application.
For decades, Python’s GIL has been like having one grill. No matter how many cooks (threads) you hired, they could only take turns using it. But with Python 3.13’s GIL-free build, it’s like Python’s kitchen just hired more cooks and added more grills. Things can finally happen at the same time! This is multi-threading with free-threading enabled.
But does this mean async programming is going to be obsolete? Do we still need our efficient multitasking cook?
To answer that, we’re going to explore the choice between multi-threading and async programming, and how this decision is influenced by the removal of the GIL, in order to determine which approach is best for your needs.
Sync, async or threads: How to choose the right model?
Synchronous
Synchronous programming is the most straightforward and traditional model. Tasks are executed one at a time, from start to finish, in a strict, sequential order. The program waits for each task to complete before moving on to the next one. This simplicity makes it easy to understand, easy to write, and easy to debug. It’s also well-suited for CPU-bound operations where one CPU core is enough.
Returning to our kitchen, synchronous programming is like the single cook handling every task from start to finish. If you’re preparing a simple meal without interruptions, there’s no need to overcomplicate—just cook one dish at a time. In my opinion, this means synchronous programming should be your default choice, especially for its simplicity and clarity.
When to use sync:
- One CPU is enough
- Responsiveness doesn’t matter
- Tasks can be executed one after another
- Blocking is acceptable, like batch processing that’s part of a data pipeline
Trade-offs:
- Not efficient if your tasks involve waiting (network requests, file I/O, database queries)
- Program remains blocked during waiting times
- Quickly becomes a bottleneck in highly responsive scenarios
It’s a solid choice for straightforward, single-threaded operations, but it quickly becomes a bottleneck when high responsiveness is required, such as in a user interface or a busy web server.
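The trade-off above can be sketched in a few lines. This is a minimal illustration, not code from the talk: `fetch_report` is a hypothetical stand-in for any blocking call, and the point is that each one blocks the whole program until it finishes.

```python
import time

def fetch_report(name: str) -> str:
    # Hypothetical stand-in for a blocking operation
    # (network request, file I/O, database query, ...).
    time.sleep(0.1)
    return f"report:{name}"

def main() -> list[str]:
    # Each call blocks until it finishes, so total time is the sum
    # of the individual waits: simple, predictable, easy to debug.
    return [fetch_report(name) for name in ("steak", "salad")]

print(main())  # ['report:steak', 'report:salad']
```

With two 0.1-second waits, the sequential version takes about 0.2 seconds: the single cook finishes the steak before touching the salad.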
Async
Now, let’s look at async programming, a model designed for handling I/O-bound tasks more efficiently. The focus here is on multitasking, not parallelism: the program is broken into small steps that still execute one at a time. Through rapid context switching, it creates the illusion of parallel execution and makes programs more responsive.
Think back to the multitasking cook in our kitchen analogy. If your cook constantly needs to check the grill while also preparing smaller dishes, async programming is your go-to approach. It allows the cook to juggle multiple tasks without extra resources, maintaining responsiveness and efficiency.
When to use async:
- When responsiveness matters, but one CPU core is enough.
- You’re dealing primarily with I/O-bound tasks like network calls, file operations, or databases.
- Your tasks are structured as smaller, independent operations that can interleave efficiently.
- You want concurrency without managing thread complexities.
Trade-offs:
- The learning curve is steeper due to concepts like event loops and task scheduling. It’s a bit more difficult to grasp what’s going on under the hood.
- You need libraries that are compatible with async, which often means rewriting existing code.
- Debugging can be tricky.
- Not suitable for CPU-intensive tasks—a single heavy computation can block the event loop.
Despite this, it is a powerful method for high-concurrency scenarios where responsiveness is crucial.
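Here is the multitasking cook as a minimal `asyncio` sketch (the `fetch_report` task is a hypothetical stand-in for any I/O wait): while one task is awaiting, the event loop runs the other, so the waits overlap instead of adding up.

```python
import asyncio

async def fetch_report(name: str) -> str:
    # await yields control to the event loop while "waiting", so other
    # tasks can make progress in the meantime (one cook multitasking).
    await asyncio.sleep(0.1)
    return f"report:{name}"

async def main() -> list[str]:
    # Both waits overlap: total runtime is ~0.1s, not 0.2s.
    return await asyncio.gather(fetch_report("steak"), fetch_report("salad"))

print(asyncio.run(main()))  # ['report:steak', 'report:salad']
```

Note that everything still runs on one thread and one core; a heavy computation inside either coroutine would block the whole loop, which is exactly the trade-off listed above.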
Multi-threading (free threading)
Finally, let’s talk about multi-threading, a model designed to handle CPU-bound tasks at scale. Multi-threading focuses on executing multiple tasks at the same time. Until now, Python’s multi-threading was restricted by the GIL, which means:
- Even if you created multiple threads, only one thread could execute Python bytecode at a time, making threading ineffective for CPU-bound Python code
- For things like file I/O, where operations happen at a low level, multi-threading with the GIL can still be effective, because the GIL is released during these I/O operations
- But whenever the program needs to execute Python bytecode on multiple cores at once, the GIL is the limitation
Python 3.13’s GIL-free build (3.13t) removes this limitation, finally enabling true parallelism. If a dish requires intensive preparation, like a big, complex meal made from scratch, and you need to prepare many of them at scale, then adding more cooks is your solution.
Free threading in Python is exactly that: multiple cooks preparing multiple dishes at once, working together without blocking each other.
Use free threading when:
- One CPU just isn’t enough
- Performing heavy computations like data processing, image manipulation, or mathematical calculations.
- You can leverage the GIL-free build of version 3.13 for real multi-threading.
- You need true parallelism across multiple cores.
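A thread pool over CPU-bound work is the natural fit here. This sketch (with a hypothetical `simmer` function standing in for heavy computation) runs unchanged on both builds; the difference is that on the free-threaded build the four calls can genuinely run on separate cores.

```python
from concurrent.futures import ThreadPoolExecutor

def simmer(n: int) -> int:
    # Pure CPU-bound work, no I/O. On the free-threaded build these
    # calls can run on separate cores in parallel; on a GIL build the
    # threads still take turns executing Python bytecode.
    total = 0
    for i in range(n):
        total += i * i
    return total

# Four cooks, four dishes: submit the same recipe to a pool of threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(simmer, [100_000] * 4))

print(len(results), "results computed")
```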
New challenges of multi-threading without the GIL:
- Thread safety will be more difficult to achieve. Without the GIL, developers need to carefully handle data access to avoid race conditions.
- Library compatibility is another concern. Many existing Python packages are written with the assumption that the GIL provides thread safety.
- With the GIL gone, most of these packages will have to adapt, and it will take time for them to support the free-threading model and ensure no-GIL compatibility.
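The thread-safety point is worth making concrete. In this small sketch (the names are illustrative, not from the talk), `counter += 1` is a read-modify-write sequence, not an atomic step, so concurrent updates can lose writes unless the shared state is protected by a lock.

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n: int) -> None:
    global counter
    for _ in range(n):
        # `counter += 1` is read-modify-write, not atomic; without the
        # lock, concurrent updates can interleave and lose increments.
        with lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

This discipline was always good practice, but the free-threaded build makes it essential: there is no GIL quietly serialising bytecode execution behind the scenes.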
The good news is that a lot of progress has already been made, especially in the machine learning ecosystem, where many libraries already support the free-threaded build. See https://py-free-threading.github.io/tracking/ for the current status.
Despite these challenges, multi-threading without the GIL is a major step forward, providing powerful performance improvements for the CPython interpreter.
Key takeaways
- Synchronous programming: Default choice—simple, effective, best for sequential tasks where responsiveness isn’t critical.
- Async programming: Perfect for responsive, I/O-bound workloads needing high concurrency, yet limited to a single CPU core.
- Free-threading (Multi-threading): Ideal for tasks that demand true parallel execution across multiple CPU cores—now possible with Python 3.13’s GIL-free build.
Python without the GIL is a fantastic new tool, but it doesn’t replace async. It simply expands our options. This flexibility lets you precisely choose the right programming model for your Python project’s specific needs.