Wednesday, 29 April 2026

Learn concurrency - a deep dive into multithreading with Python

By Nikos Vaggalis - journalist at i-programmer and software engineer. Read more of his work here

This article explains concurrency in Python, including topics like multithreading, multiprocessing, race conditions, and synchronization mechanisms such as locks. We’ll then take a deep dive into switching off the GIL to enable real multithreading in Python, highlighting the differences, the benefits and the gotchas with clear code examples.

Introduction

You might be wondering why you’ll need concurrency at all. In most cases you won’t; but you will if you are involved with (for example):

  • Data processing and ETL - parsing massive text files, cleaning up messy data, or applying complex regular expressions to millions of rows;
  • Cryptography and hashing - reading files and calculating a cryptographic hash (like SHA-256) for every single one;
  • Data science - running Monte Carlo simulations requiring heavy math operations such as those provided by NumPy, Pandas, and Scikit-Learn; or
  • Network operations - downloading files, scraping web sites or calling REST APIs.

But first, let’s get our terminology straight: concurrency, parallelism and multithreading are similar-sounding terms that are easy to confuse. So let’s look at them in simple terms.

Sequential - a single process doing one thing at a time, and waiting for it to complete before commencing the next process.

Concurrency - a single process managing multiple things at once, but not necessarily doing them simultaneously. e.g. a chef (the process) chops onions, puts the soup on to boil, switches to seasoning the meat while the soup boils, starts frying the onions, and then cuts the meat while the onions are frying and the soup is boiling.

Event Loop - A control structure that continuously waits for events (such as I/O completion, timers, or user actions), dispatches the associated tasks, and then repeats.

Parallelism/Multiprocessing - doing multiple things at the exact same time. e.g. two chefs (processes), one chops onions and the other chops tomatoes at the same time.

Multithreading - a programming model where a single process spawns multiple separate threads of execution to achieve concurrency. All threads share the exact same memory space and resources. Thought of that way, it can also deliver parallelism, but within a single process (sharing memory) rather than across separate processes.

Thread safety is the property of a program or system that allows multiple threads to concurrently access and modify shared memory and resources without causing data corruption, memory leaks, or fatal program crashes.

When multiple threads operate within the same memory space, there is a risk that they will attempt to read and write to the same data simultaneously. If this access is not carefully synchronized, it leads to a race condition, meaning the final result depends unpredictably on the exact timing of when each thread executes its tasks.

To achieve thread safety and prevent these race conditions, developers and programming languages rely on various synchronization mechanisms like mutexes and locks and atomic operations.

The Global Interpreter Lock (GIL) - a mutex in CPython, enabled by default, that allows only one thread to execute Python bytecode at a time, preventing threads from running Python code in parallel.

Free-threading - running Python without the GIL, so that threads can execute Python code in true parallel.

Multiple CPU cores - A CPU core is an individual processing unit within a computer’s processor (CPU) that reads and executes instructions independently. Modern CPUs usually feature multiple CPU cores, which enable parallelism.

Example: Parallelism/Multiprocessing

Just run this command in two terminals:

python -c "while True: pass"

(using ctrl-C to stop it when the experiment is over)

If your computer has multiple CPU cores, then each of these separate processes will use a different core. This isn’t a very interesting example of parallel concurrency because there is nothing shared between the two processes.
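The same two-terminal experiment can be scripted, as a sketch, with the subprocess module (the one-second busy loop and the timings are illustrative choices, not part of the original example):

```python
import subprocess
import sys
import time

# Sketch: launch two independent busy-loop processes, just like opening
# two terminals. Each child spins for about one second.
cmd = [sys.executable, "-c",
       "import time\nt = time.time() + 1\nwhile time.time() < t: pass"]

start = time.time()
procs = [subprocess.Popen(cmd) for _ in range(2)]
for p in procs:
    p.wait()
elapsed = time.time() - start

# On a multi-core machine the loops run on different cores, so the
# total is about 1 second rather than 2.
print(f"Both busy loops finished in {elapsed:.1f}s")
```

On a single-core machine the two loops would instead be time-sliced on the same core and take about two seconds in total.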

Example: Non-parallel Concurrency

Let’s step back and find a more practical use of concurrency, where a function’s code - here, the worker function - is run concurrently.

import asyncio

async def worker(name, delay):
    print(f"{name} starting")
    await asyncio.sleep(delay)
    print(f"{name} finished after {delay} seconds")


async def main():
    task1 = asyncio.create_task(worker("task1", 1))
    task2 = asyncio.create_task(worker("task2", 1))
    task3 = asyncio.create_task(worker("task3", 1))

    await task1
    await task2
    await task3

asyncio.run(main())

The asyncio package runs entirely on a single thread, so it only ever has one CPU core available to it. It relies on the fact that the tasks involve I/O operations (like waiting for a network, a database, or a timer like asyncio.sleep()). It only works because the CPU is essentially doing nothing during that waiting period, allowing the event loop to feed it another task. Therefore, under asyncio, when a blocking I/O operation is hit, the code explicitly yields control back to the event loop, allowing the loop to efficiently run other tasks rather than sitting idle.

If you remove asyncio, each task runs sequentially, one after the other, taking roughly three times as long as the asyncio code.
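For comparison, here is a sketch of the same three workers written without asyncio, using a plain blocking time.sleep; the total time is now the sum of the delays:

```python
import time

def worker(name, delay):
    print(f"{name} starting")
    time.sleep(delay)          # blocks the whole program; nothing else runs
    print(f"{name} finished after {delay} seconds")

start = time.time()
for name in ("task1", "task2", "task3"):
    worker(name, 1)
elapsed = time.time() - start
print(f"Total: {elapsed:.1f} seconds")   # ~3 seconds, versus ~1 with asyncio
```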

At this point it’s important to note that the GIL severely restricts CPU-bound code, but it has very little negative impact on I/O-bound code like that in the example above. CPU-bound tasks are those limited by the raw speed of your processor, such as heavy mathematical computations, image manipulation, or complex data processing. Because the GIL ensures only one thread can execute Python bytecode at a time, it completely prevents CPU-bound threads from running in parallel across multiple CPU cores. Attempting to use multithreading for CPU-bound tasks (as per the standard setup with the GIL switched on) can actually degrade your program’s performance. This slowdown occurs because the threads continuously fight for control of the GIL, leading to significant context-switching overhead and resource contention.

Conversely, I/O-bound tasks spend the vast majority of their time waiting for external operations to finish, such as downloading data from the internet, querying a database, or reading and writing files to a hard drive. The GIL does not prevent these operations from occurring concurrently. Whenever a Python thread initiates a blocking I/O operation, it voluntarily releases the GIL. This allows the interpreter to hand the lock over to another thread, which can actively execute Python code while the first thread waits in the background for its data to arrive. Because of this cooperative multitasking, multithreading is a highly effective strategy for speeding up I/O-bound Python programs.
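A minimal sketch of this effect, simulating the blocking I/O call with time.sleep (the delay and thread count here are arbitrary choices): even with the GIL on, four waiting threads overlap, so four half-second "downloads" complete in about half a second rather than two:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(name):
    # Stand-in for a blocking I/O call (network, database, disk).
    # While this thread sleeps, it releases the GIL and the others run.
    time.sleep(0.5)
    return name

start = time.time()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fetch, ["a", "b", "c", "d"]))
elapsed = time.time() - start

print(f"Fetched {len(results)} items in {elapsed:.1f}s")  # ~0.5s, not 2s
```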

Free-Threaded Concurrency

Free-threaded concurrency is the way we can have multiple threads running truly in parallel, by removing the Global Interpreter Lock (GIL).

By default the GIL prevents threads from running in parallel on multi-core processors, forcing developers to use multiprocessing or multithreaded C extensions for performance-heavy tasks. Removing the Global Interpreter Lock (GIL) from Python enables true multithreaded parallelism.

The Global Interpreter Lock (GIL)

But let’s take it from the beginning. What is the Global Interpreter Lock, and why is it there?

The Global Interpreter Lock is a mutex (mutual exclusion lock) that allows only one native thread to execute Python bytecode at a time within a single process. Because of the GIL, Python threads cannot achieve true parallel execution on multiple CPU cores; instead, they take turns sharing the processor through cooperative or preemptive multitasking.

The GIL was originally implemented in the early 1990s because it provided a straightforward way to ensure thread safety for Python’s internal memory management (specifically reference counting) and made it much easier to integrate with C libraries that were not thread-safe. By not requiring multiple granular locks for every data structure, the GIL also kept single-threaded programs running extremely fast.

Despite its early benefits, the GIL can now (optionally) be switched off because it has become a massive bottleneck for high-performance computing, artificial intelligence, and machine learning. The primary reasons driving its removal are:

  • Inability to leverage modern hardware. Modern computers rely heavily on multi-core processors, but the GIL prevents CPU-bound Python programs from fully utilizing these cores. When multiple threads try to perform heavy computations in Python, they end up fighting for the GIL. This causes significant context-switching overhead and can actually degrade performance compared to using just a single thread.
  • The severe drawbacks of multiprocessing. To work around the GIL, developers have historically relied on parallelism/multiprocessing—spawning entirely separate system processes instead of threads. However, multiprocessing comes with severe penalties; creating processes consumes far more memory than threads, and sharing data between processes requires expensive data serialization and inter-process communication.
  • Code complexity and C++ rewrites. The inability to easily run parallel threads has forced developers to maintain complex, brittle workarounds. In many cases, organizations are forced to translate large portions of their Python codebases into C or C++ just to achieve the necessary performance.

Through PEP 703, Python has officially introduced an experimental “free-threaded” build starting in Python 3.13, which allows developers to run Python with the GIL disabled. To make this possible without crashing the interpreter, Python’s internals are being completely overhauled with new thread-safe mechanisms. These include a new memory allocator called mimalloc, “immortalizing” certain objects so they don’t require reference counting, and using advanced techniques like biased and deferred reference counting to prevent threads from locking each other up.

Installing the free-threaded Python (i.e. Python without the GIL)

Here’s a quick and easy installation using the uv package manager, done inside a virtual environment to avoid making changes to your normal setup.

On MacOS or Linux:

# install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# logout and login again

# check that uv has been installed
uv --version

# install your free threaded Python in a virtual environment
uv venv --python 3.14t

# activate the virtual environment
source .venv/bin/activate

On Windows:

# install uv
winget install --id=astral-sh.uv  -e
# or if you don't have winget installed:
powershell -ExecutionPolicy ByPass -c `
  "irm https://astral.sh/uv/install.ps1 | iex"

# logout and login again

# check that uv has been installed
uv --version

# install your free threaded Python in a virtual environment
uv venv --python 3.14t

# activate the virtual environment
.venv\Scripts\activate

To check that you have the right Python installed, run:

python -VV

The output should be something like:

Python 3.14.4 free-threading build (main, Apr 14 2026, 14:35:29)

Inside your script, you can call sys._is_gil_enabled(). It will return False if the GIL is turned off.

$ python -c 'import sys; print(sys._is_gil_enabled())'
False

You can switch GIL back on with the -X command-line parameter like this:

$ python -X gil=1 -c 'import sys;
  print(sys._is_gil_enabled())'
True

And you can see whether the version of Python you’ve installed is capable of switching the GIL on or off by checking whether it has the config variable Py_GIL_DISABLED:

$ python -c 'import sysconfig;
  print(sysconfig.get_config_var("Py_GIL_DISABLED"))'
1
$ python -X gil=1 -c 'import sysconfig;
  print(sysconfig.get_config_var("Py_GIL_DISABLED"))'
1

Note: If you want all the Python code on the server to be using the free threaded Python, you can do a system level installation following the instructions here.

Liability Waiver

Before we start multi-threading with GIL-free Python, please keep the following caveats in mind:

Race conditions. If you do not carefully synchronize the order in which threads read and write to shared data, the final result will depend entirely on the unpredictable timing of the threads. These race conditions lead to incorrect outputs and are notoriously difficult to reproduce and debug.

Memory leaks and interpreter crashes. Python internally relies on reference counting to manage memory. If multiple threads concurrently increment and decrement an object’s reference count without thread-safe mechanisms, the count can become corrupted, leading to memory leaks or sudden program crashes.

Deadlocks. To prevent race conditions, developers rely on locks. However, this introduces the risk of lock ordering deadlocks. This happens when multiple threads attempt to acquire the same set of locks but in different orders, causing them to freeze indefinitely as they wait on one another to release a lock.

Unsafe iterators. Specifically in Python’s new free-threaded builds, concurrently accessing the same iterator from multiple threads is not thread-safe and may cause your program to duplicate or miss the processing of elements.
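A common thread-safe alternative, sketched below, is to stop sharing the iterator altogether and hand items out through queue.Queue, which is safe in both GIL and free-threaded builds (the worker and item counts are arbitrary):

```python
import queue
import threading

# Instead of several threads pulling from one shared iterator, put the
# work items on a queue.Queue; each item is handed to exactly one worker.
work = queue.Queue()
for item in range(100):
    work.put(item)

results = []
results_lock = threading.Lock()

def consume():
    while True:
        try:
            item = work.get_nowait()   # atomic hand-off, no duplicates
        except queue.Empty:
            return
        with results_lock:
            results.append(item)

threads = [threading.Thread(target=consume) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results) == list(range(100)))  # True: each item handled once
```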

C-extension vulnerabilities. For developers using C-extensions, relying on borrowed references (temporarily using an object from a list or dictionary without taking formal ownership of it) is highly dangerous. Without the GIL, a separate thread could delete or modify the object inside the collection while the first thread is still trying to read it. Additionally, improperly sharing raw C-pointers between threads can easily trigger segmentation faults and data corruption. Examples of C-extension packages you should be careful about running in free-threaded Python include NumPy, Pandas, and Scikit-Learn.

Number of CPU cores. The examples below are tailored to a computer with 8 CPU cores. To get an accurate measurement of performance of the example scripts below, modify THREADS to match the number of cores on the machine where you’re running them.
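Rather than hard-coding it, one way to size THREADS to your machine is os.cpu_count() (note that it may return None, hence the fallback):

```python
import os

# Sketch: size the thread count to the machine instead of hard-coding 8.
# os.cpu_count() may return None on some platforms, so fall back to a default.
THREADS = os.cpu_count() or 4
print(f"Using {THREADS} threads")
```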

Example: Embarrassingly Parallel Multithreading

The great thing about Python’s transition to a GIL-free architecture is that pure Python code (code that does not extend to C libraries) does not need to be modified or rewritten to run on the free-threaded version. The exact same multithreaded code works in both environments. The difference lies entirely in which Python interpreter you use, and the environment flags you set when executing the script.

Here is a standard multithreaded Python example that uses threading.Thread to execute a CPU-bound task (calculating Fibonacci numbers) across eight threads:

import threading
import time

THREADS = 8

def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

def main():
    start_time = time.time()
    threads = []

    # Create and start 8 threads for a CPU-bound task
    for i in range(THREADS):
        thread = threading.Thread(target=fib, args=(35,))
        thread.start()
        threads.append(thread)

    # Wait for all threads to finish
    for thread in threads:
        thread.join()

    print(f"Completed: {time.time() - start_time:.2f} seconds.")

if __name__ == "__main__":
    main()

The benchmarks are eye-opening. Running the code under a free-threaded build yields:

$ python fib.py
Completed: 0.81 seconds.

while with the GIL enabled:

$ python -X gil=1 fib.py
Completed: 4.13 seconds.

Due to the GIL, performance is roughly five times slower. This is because the GIL prevents multiple threads from executing Python bytecode at the same time, hence the threads continuously interrupt each other and fight for the lock. The program runs serially and utilizes only about 12.5% of an 8-core CPU’s capacity, acting exactly like a single-threaded program.

Without the GIL, the CPU cores will immediately run the threads in parallel. You will see a massive drop in execution time on an 8-core machine, because the threads are no longer waiting on a global lock to execute their instructions.

That said, a warning on thread safety. Even though the code syntax doesn’t change, the safety of your existing code might. This is because there’s a common misconception that the GIL had made all Python code inherently thread-safe. In reality, the GIL primarily protected Python’s internal memory management and state, but it did not guarantee that your high-level application operations were atomic or safe from race conditions.

Let’s revisit the Fibonacci code example. The recursive fib(n) function is purely a computationally expensive mathematical operation used to benchmark CPU-bound performance. Each thread executes this function completely independently, relying only on its own isolated call stack and local inputs, and therefore it is thread safe. Because the threads do not modify any shared data or state, it does not require any synchronization.

Synchronization primitives (such as locks) are specifically required to coordinate access to shared resources, preventing race conditions that occur when multiple threads try to read and write to the same memory space simultaneously.

Because the threads in the example above do not exchange data, write to shared global variables, or merge partial results, they do not step on each other’s toes.

When tasks can be performed completely independently of one another without requiring any data exchange, they are often referred to as “embarrassingly parallel”. Since there is no shared state to protect in this scenario, adding a lock is entirely unnecessary.

Example: Parallel Multithreading with Shared Resources - Buggy

Let’s now take a look at where things can go awry, using a simple parallel counter.

counter_buggy.py

import threading
import time

COUNT = 1_000_000
THREADS = 8

counter = 0

def worker():
    global counter

    for _ in range(COUNT):
        counter += 1

threads = [
    threading.Thread(target=worker)
    for _ in range(THREADS)
]

start_time = time.time()
for t in threads:
    t.start()

for t in threads:
    t.join()

print("Time:", time.time() - start_time)
print("Counter:", counter)
print("Expected:", COUNT * THREADS)

What the code is doing

We spin up 8 threads, and each thread increments counter 1,000,000 times:

counter += 1

What should happen (correct result) if done safely:

counter = 0
  + 1,000,000 (Thread A)
  + 1,000,000 (Thread B)
  + ...
  + 1,000,000 (Thread H)

  = 8,000,000

The final result should be 8,000,000.

Let’s check that. First of all, with GIL switched on:

$ python -X gil=1 counter_buggy.py
Time: 0.17042112350463867
Counter: 8000000
Expected: 8000000

Now, letting the threads run in parallel:

$ python counter_buggy.py
Time: 0.27542805671691895
Counter: 1105222
Expected: 8000000

The counter is wrong!!

What actually happens (race condition):

The key is that:

counter += 1

is NOT one step. It expands to:

1. Read counter
2. Add 1
3. Write counter back
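You can see these three steps for yourself with the standard dis module (the exact opcode names vary between Python versions, but the separate load, add and store are always there):

```python
import dis

counter = 0

def increment():
    global counter
    counter += 1

# Disassemble the function: the increment expands to a load of the
# global, an add, and a store back - three distinct bytecode steps,
# and a thread can be interrupted between any of them.
dis.dis(increment)
```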

Interleaving is what causes the bug:

Step-by-step execution:

Initial state: counter = 0
Thread A reads counter : 0
Thread B reads counter : 0   (both read same old value)
Thread A computes 0 + 1 = 1
Thread B computes 0 + 1 = 1
Thread A writes counter = 1
Thread C reads counter = 1
Thread C computes 1 + 1 = 2
Thread C writes counter = 2
Thread B writes counter = 1

We expected:

0 + 1 + 1 + 1 = 3

But instead:

Thread B ignored Thread A’s update

This happens because both threads read before either writes, as there’s no synchronization. This is exactly what a race condition is: the outcome depends on the timing (the “race”) between threads.

Cool, so the GIL protects me from those conditions?

The GIL sometimes hides this because in CPython threads don’t run truly in parallel and context switches happen less aggressively, so this exact interleaving happens less often, but it is still possible.

In no-GIL Python, threads run simultaneously, so this interleaving becomes common and easy to reproduce.

The mental model to remember is to think of counter += 1 as:

“Read → Compute → Write”

If two threads do that at the same time, they can both read the same old value and one update gets overwritten. Hence the Golden Rule to stick by is:

If multiple threads access shared mutable data and at least one writes, then use synchronization.

Example: Parallel Multithreading with Shared Resources - Simple Fix

With a multi-threaded program updating a shared counter, you need to use synchronization primitives like threading.Lock to protect your application’s shared data and logic, just as you did when the GIL was present. If your code required locks or queues for thread safety before, it will still require them in the free-threaded version. Let’s now check an example:

import threading
import time

COUNT = 1_000_000
THREADS = 8

counter = 0
lock = threading.Lock() # FIX A: create a lock object

def worker():
    global counter

    for _ in range(COUNT):
        # FIX B: lock this block so only 1 thread can run it
        with lock:
            counter += 1

threads = [
    threading.Thread(target=worker)
    for _ in range(THREADS)
]

start_time = time.time()
for t in threads:
    t.start()

for t in threads:
    t.join()

print("Time:", time.time() - start_time)
print("Counter:", counter)
print("Expected:", COUNT * THREADS)

Running it now, you see that the Counter is correct both with GIL switched on and off.

$ python -X gil=1 counter_fixed.py
Time: 0.5505349636077881
Counter: 8000000
Expected: 8000000

$ python counter_fixed.py
Time: 0.8671579360961914
Counter: 8000000
Expected: 8000000

But wait! The GIL-free version took almost twice as long to execute. What’s happening? Where’s the promise of ultra-fast speeds? This is completely counterintuitive!

It makes perfect sense to expect that turning off the GIL would instantly make your multi-threaded code run faster. However, this specific code highlights one of the biggest paradoxes in parallel programming: removing the GIL does not magically make serialized code parallel, and it can actually make it worse.

The reason the non-GIL version is so much slower comes down to two main factors: extreme lock contention and CPU cache bouncing.

Here is exactly what is happening under the hood.

1. The Code is Inherently Sequential.

Take a close look at the core loop:

for _ in range(COUNT):
    with lock:
        counter += 1

Because of the with lock: statement, only one thread can ever modify counter at a time. Even if you have 100 CPU cores, 99 of them will be paused, waiting in line for the 1 core that currently holds the lock. There is actually zero parallel work happening in this specific task.

2. Execution with the GIL (not parallel, but fast)

When the GIL is enabled, Python acts as a central traffic cop. The GIL ensures that only one thread executes Python bytecode at any given moment. Because the GIL is already preventing multiple threads from running simultaneously on multiple cores, contention for the lock is actually quite low. Python’s internal thread switching handles the hand-offs relatively smoothly. The threads aren’t fiercely fighting over the lock at the operating system level because the GIL is keeping them somewhat orderly.

3. Execution without the GIL (parallel, but slow)

When you disable the GIL, Python takes the training wheels off and hands control directly to your Operating System. Now, the OS puts your 8 threads onto 8 separate, physical CPU cores and yells “Go!”

All 8 cores instantly rush to acquire the lock at the exact same fraction of a millisecond. Because the lock is highly contested (8 million acquisitions in total), the operating system has to constantly intervene, putting threads to sleep and waking them up. This OS-level context switching is incredibly heavy and expensive compared to Python’s internal GIL management.

The other factor is cache bouncing at the hardware level: the CPU cores spend more time aggressively syncing and invalidating each other’s memory caches (cache bouncing, or “ping-ponging”) than actually doing the math.

Example: Parallel Multithreading with Shared Resources - Proper Fix

Disabling the GIL is not a silver bullet for performance. The “No-GIL” version shines in CPU-bound tasks that do not share state (the “embarrassingly parallel” workloads). If your threads are doing heavy math on independent variables, the No-GIL version will be significantly faster. But if your threads are constantly fighting over a single, shared, locked variable, the No-GIL version will suffer from severe hardware-level traffic jams. As such, the secret to making multi-threaded code fly—especially without the GIL—is to eliminate shared state.

Having looked at this issue, let’s rewrite the code so that it actually leverages multiple cores and runs dramatically faster without the GIL. To do so, instead of having all eight threads fight over a single, locked counter, we will give each thread its own local counter. They will do their work completely independently, and we will simply add up their final tallies at the very end.

We’re going to do so by utilizing Python’s concurrent.futures module, which is the cleanest way to handle this:

from concurrent.futures import ThreadPoolExecutor
import time

COUNT = 1_000_000
THREADS = 8

def worker():
    local_counter = 0

    for _ in range(COUNT):
        local_counter += 1

    return local_counter

start_time = time.time()

with ThreadPoolExecutor(max_workers=THREADS) as executor:
    results = list(executor.map(
        lambda _: worker(), range(THREADS)
    ))

counter = sum(results)

print("Time:", time.time() - start_time)
print("Counter:", counter)
print("Expected:", COUNT * THREADS)

Not only is it correct, but it’s about five times as fast as the buggy code!

$ python counter_fixed_fast.py
Time: 0.05626487731933594
Counter: 8000000
Expected: 8000000

$ python -X gil=1 counter_fixed_fast.py
Time: 0.10117506980895996
Counter: 8000000
Expected: 8000000

Wow, talk about an improvement! Why is this version drastically better? Firstly, zero lock contention: since there’s no threading.Lock() (as we used in the quick fix), the operating system never has to pause a thread to wait for another one. Secondly, in contrast with the buggy code, there’s zero cache bouncing: each thread updates a variable (local_counter) that lives in its own isolated CPU cache, so the cores don’t waste time synchronizing memory around a shared global variable.

Run in a “free-threaded” Python environment, this specific code achieves true parallelism and scales beautifully. The cores sprint through their individual loops simultaneously, and the execution time drops significantly.

The Golden Rule of Parallelism

Whenever you are trying to speed up code using multiple cores, always ask yourself: “Do these threads need to talk to each other right now?” If the answer is yes, it will be slow. The best parallel code splits a big job into completely isolated chunks, processes them separately, and merges the results at the finish line.
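As a sketch of that split-process-merge shape, here is a large sum broken into eight isolated chunks, with the partial results merged only at the finish line (the data size and chunk count are arbitrary choices):

```python
from concurrent.futures import ThreadPoolExecutor

# Split: divide the job into completely independent slices.
data = list(range(1_000_000))
CHUNKS = 8
size = len(data) // CHUNKS
slices = [data[i * size:(i + 1) * size] for i in range(CHUNKS)]

# Process: each worker sums its own slice - no shared state while working.
with ThreadPoolExecutor(max_workers=CHUNKS) as pool:
    partials = list(pool.map(sum, slices))

# Merge: combine the partial tallies only at the very end.
total = sum(partials)
print(total == sum(data))  # True
```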

Most pure Python code will not need to be rewritten or touched. The free-threaded architecture was specifically designed to be highly compatible with existing code. Operations that were considered atomic under the GIL continue to be atomic in the free-threaded version. Therefore, as long as a pure Python library was already properly thread-safe (using standard synchronization tools like threading.Lock where appropriate), it will remain thread-safe without the GIL.

But what about the Python code and libraries that rely on underlying C libraries? Shouldn’t they be rewritten in order to become thread-safe and be safely used on a GIL-disabled Python?

Historically, the Python C API gave developers direct control over the GIL, and many native extensions implicitly relied on the GIL to protect global data structures and object states within their C code. Libraries that rely on native C extensions—which includes most data science, AI, and performance-critical libraries like NumPy, Pandas and scikit-learn — are not yet fully thread-safe.

What happens if a library hasn’t been rewritten yet? Python includes a built-in safety net. If you import a third-party C-API extension package that has not been explicitly marked as supporting free-threading, the Python interpreter will automatically re-enable the GIL at runtime and print a warning. This prevents legacy packages from crashing your program, though it temporarily removes your multi-core performance gains.
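If you want to confirm at runtime that you are still actually running free-threaded after your imports, a small hedged check works on any build (the hasattr guard covers regular CPython, where sys._is_gil_enabled does not exist):

```python
import sys

# Sketch: after importing your C-extension dependencies, check whether
# the interpreter is still running free-threaded - importing an
# incompatible extension can silently re-enable the GIL. On builds
# without sys._is_gil_enabled, the GIL is always on.
gil_on = sys._is_gil_enabled() if hasattr(sys, "_is_gil_enabled") else True
print("GIL active:", gil_on)
```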

The good news is that the transition is already underway. Organizations like Meta and Quansight are actively trying to add free-threading compatibility to the most popular packages in the Python ecosystem, while websites like Python Free-Threading Guide have even been set up to track the compatibility status of popular libraries.

To conclude, GIL-free Python can truly improve performance many times over, but it’s not a fit-all solution; removing the GIL does not magically make serialized code parallel, and it can actually make it worse. You’ve got to understand how to structure your code to take advantage of it.

Further Exploration

If you want to stretch the capabilities of the new free-threading system, the Python Free-Threading Guide has some outstanding examples where the absence of the GIL works wonders. One particular example, on web scraping, blends multithreading with asyncio, two concepts that initially look entirely orthogonal!

“Web scraping is the process of extracting useful data from websites, and it becomes especially challenging and time-consuming when dealing with hundreds or thousands of pages. The traditional synchronous approach scrapes only one page at a time and is slow. With asyncio, we can leverage asynchronous I/O to scrape multiple pages concurrently, which significantly speeds up the process; however, asyncio can only utilize a single CPU core. With free-threaded Python, we can run multiple asyncio workers in threads to utilize all available cores.”

Monday, 23 February 2026

Python for Java developers

We’re pleased to welcome our guest blogger, Nikos Vaggalis. Nikos is a journalist at i‑programmer and a seasoned software engineer with extensive hands-on experience across a wide range of technologies. With a career that bridges technical journalism and real-world software development, he brings a uniquely informed perspective - grounded in both industry practice and analytical insight.

You can explore more of Nikos’ published work in this collection of his articles. 

Python. Even if you close your eyes and try to ignore it, you really can’t; it’s everywhere, especially in the AI and Data Science fields. Therefore, it has become a necessity to get to know it, even if you’re a Java developer enjoying its rich and large language ecosystem. Despite the fact that lately Java has been powered up with great AI libraries like LangChain4j, which make it easy to build AI-powered applications, in order to tap into the full, unconstrained pool of AI capabilities, you still need to go to Python.

Of course, to use Python for AI and other advanced use cases, you have to first go through the basics. But no worries, as help is on the way. I had the privilege of becoming a beta tester of Geekuni’s Python Essentials, a brand-new course for intermediate-level programmers who know how to program in a different programming language but are new to Python.

Therefore, this tutorial is based on the course and works by converting a few but crucial selected excerpts from Python to Java. As such, it is addressed to programmers who come from a Java background and want to learn Python, but it works the other way too, that is, for Pythonistas looking at Java. So let’s begin!

In Python, indentation is meaningful

In most languages, indentation is something developers use to help other developers (or themselves in six months) easily identify sections of code. In Python, however, indentation actually defines compound statements, in the way that { curly braces } do for languages like C:

if (x > 0) {
    printf("x is positive\n");
    printf("x squared is %d\n", x * x);
}

and begin and end do for Pascal:

if x > 0 then
begin
    writeln('x is positive');
    writeln('x squared is ', x * x);
end;

In Python an if statement looks like:

if x > 0:
    print('x is positive')
    print('x squared is', x**2)

print('We are out of the if statement now')

In Java of course you use braces.
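To make that concrete, here is a sketch of the same if statement in Java, runnable in the JBang environment used later in this article (the value of x is assumed to be 3 for illustration):

```java
public class IfConversion {

    public static void main(String[] args) {
        int x = 3; // assumed value for illustration
        if (x > 0) {
            System.out.println("x is positive");
            System.out.println("x squared is " + (x * x));
        }
        // The braces, not the indentation, end the if block
        System.out.println("We are out of the if statement now");
    }
}

IfConversion.main(null);
```

Note that you could remove all the indentation and the Java code would still compile and behave identically; in Python that would change the program's meaning.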

Polymorphism with strings, and introducing the REPL

We’ll find out the various ways of representing strings by using Python’s REPL (Read-Eval-Print Loop) as a hands-on tool for learning these concepts interactively.

Just open a terminal and run python3 at the command-line.

$ python3

You can then type things at the prompt, press ENTER and see any output on the following line. For example:

>>> name = "Zachary"
>>> print("Hello, " + name)
Hello, Zachary

In summary: it Reads what you typed; Evaluates that string (which means, it runs it as a Python command); Prints any output (in this example, only the print statement produced output); and Loops back to read the next line you type.

Now you can use the REPL to experiment with the two separate concepts we’re covering here. The first is polymorphism - where we see an operator (or function) behaving differently depending on its arguments. For example:

>>> print(2 + 3)
5
>>> print('foo' + 'bar')
foobar

In the first case it’s adding the arguments, and in the second, it’s concatenating them. The action of the operator is determined by the types of its arguments.

Now, look at the * operator between numbers and strings.

For example:

>>> print(2 * 3)
6
>>> print("1" * 5)
11111

Java Equivalent

In Java you don’t normally have a REPL because it is a compiled language… or do you? In fact you do, with JBang and its JShell scripting. JShell, Java’s Read-Eval-Print-Loop (REPL) tool (available since JDK 9), allows programmers to quickly test code snippets, explore APIs, and incrementally build up code, with immediate feedback.

You can download and install JBang, or you can use it straight from your browser through the “JBang powered Jupyter Environment”. No JDK install. No account. No IDE. Just head over to TryJbang, which we’re going to use for running our Java code as painlessly as possible.

Here’s the equivalent Java code - paste each block into the JBang editor and click run.

String name = "Zachary";
System.out.println("Hello, " + name );
2 + 3;
"foo" + "bar";
2 * 3;
System.out.println("1".repeat(5));

You’ll notice straight away that in Java you have to declare your variables’ types, such as String name =, which starts life as a String and dies as a String. In contrast, Python is dynamically typed: the same variable can refer to a string, a number, or any other data type over its lifetime.
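A quick REPL sketch of this difference (my own example, not from the course): in Python the same name can be rebound to values of different types:

```python
x = "hello"              # x currently refers to a str
print(type(x).__name__)  # str
x = 42                   # the same name now refers to an int
print(type(x).__name__)  # int
```

In Java, the equivalent reassignment of a String variable to an int would be a compile-time error.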

Also notice Python’s string repetition example of:

print("1" * 5)

which in Java ends up as:

System.out.println("1".repeat(5));

That is because Java does not overload the * operator for string repetition (unlike Python), requiring the String.repeat() method for this functionality.

Functions

In this exercise we experiment with implementing a function in Python.

As mentioned earlier, indentation defines compound statements in the way that curly braces do in languages like C. For example, here is a function whose body is a one-line compound statement:

def square(n):
    return n ** 2
>>> square(3)
9

There are three compound statements in the following example - the if block, the else block and also the def block which contains them:

def is_even(n):
    if n % 2:
        print(f"Sorry, {n} is odd")
        return False
    else:
        print(f"Yay! {n} is even")
        return True
>>> is_even(1)
Sorry, 1 is odd
False
>>> is_even(2)
Yay! 2 is even
True

Here’s an example of a recursive function which calls itself:

import math

def num_squares(n):
    """Return the number of squares up to, and including, n
    """
    if n == 1:
        return 0
    return num_squares(n-1) + (
        1 if math.sqrt(n).is_integer() else 0
    )
>>> num_squares(9)
2

Java Equivalent

In Java, the first example would be implemented as a method within a class, specifying int as the parameter type and the return type:

public class FunctionConversion {

    public static int square(int n) {
        return n * n;
    }

    public static void main(String[] args) {
        int result = square(3);
        System.out.println(result); // Output: 9
    }
}

FunctionConversion.main(null);

On to the second example. In Java, conditional logic uses parentheses around the condition and { curly braces } to delimit the if and else blocks, while output is handled using System.out.println():

public class FunctionConversion {

    public static boolean isEven(int n) {
        // In Python:
        // - 'if n % 2:' is true if the remainder is non-zero
        if (n % 2 != 0) {
            System.out.println("Sorry, " + n + " is odd");
            return false;
        } else {
            System.out.println("Yay! " + n + " is even");
            return true;
        }
    }

    public static void main(String[] args) {
        isEven(3);
        // Output: Sorry, 3 is odd

        isEven(2);
        // Output: Yay! 2 is even
    }
}

FunctionConversion.main(null);

The third example uses recursion. In Java, you use Math.sqrt() to calculate the square root. To check whether the result is an integer, you can compare the value to its mathematical floor (Math.floor):

import java.lang.Math;

public class FunctionConversion {

    public static int numSquares(int n) {
        if (n == 1) {
            return 0;
        }
        double root = Math.sqrt(n);

        // Determine if the root is an integer
        // In Python: 'is_integer()'
        int addValue = (root == Math.floor(root)) ? 1 : 0;

        // Recursive call
        return numSquares(n - 1) + addValue;
    }

    public static void main(String[] args) {
        int result = numSquares(9);
        System.out.println(result); // Output: 2
    }
}

FunctionConversion.main(null);

Looking at the Python and Java examples side by side, you can’t help but think that Java’s solutions are more verbose than Python’s.

Making a list with the range function

In this exercise we learn about Python’s range class and how we can use it to make a list.

Let’s start by creating a range object:

>>> range(10)
range(0, 10)

This object represents the integers matching x: 0 <= x < 10, namely 0, 1, ... 9.

But how do you get a list of the actual numbers?

nums = []
for x in range(10):
    nums.append(x)
>>> nums
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

The code above introduced you to a Python list’s append method. It works, but it’s a bit clunky.

Happily, this can be done in one line using the list constructor which takes (in this case) the range object as an argument and returns a list of its elements.

>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
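As an aside (not part of the course excerpt), range also accepts optional start and step arguments, which the list constructor handles just the same way:

```python
# range(start, stop, step) - start is inclusive, stop is exclusive
print(list(range(2, 10, 2)))  # [2, 4, 6, 8]
```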

Java Equivalent

In Java, the first example requires defining a for loop with explicit initialization, termination condition, and increment, and using a generic ArrayList to hold the resulting numbers.

import java.util.ArrayList;
import java.util.List;

public class RangeConversion {

    public static void main(String[] args) {
        // In Python: nums = []
        List<Integer> nums = new ArrayList<>();

        // In Python: range(10)
        for (int x = 0; x < 10; x++) {
            // In Python: nums.append(x)
            nums.add(x);
        }

        // Print the result to the console
        System.out.println(nums);

    }
}

RangeConversion.main(null);

Or, in modern Java, the second example - Python’s list(range(10)) - becomes:

import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class RangeConversion {
    public static void main(String[] args) {
        // Creates a List<Integer> of 0 through to 9
        List<Integer> list = IntStream.range(0, 10)
                             .boxed()
                             .collect(Collectors.toList());

        System.out.println(list);
    }
}

RangeConversion.main(null);

Lambdas (of course)

In this exercise we learn how to use lambda functions, a special type of function which is more concise for simple tasks.

First of all, here’s an example of a lambda function which takes a variable x, and returns x + 1:

lambda x: x + 1

and here’s the same function being called:

>>> (lambda x: x + 1)(2)
3

And this is the same function being assigned to a variable, and then called:

>>> f = lambda x: x + 1
>>> f(2)
3

Running it like that, you won’t be surprised that the lambda with the code lambda x: x + 1 has the same functionality as the function:

def increment(x):
    return x + 1

with the main difference to the caller being the name:

>>> f.__name__
'<lambda>'
>>> increment.__name__
'increment'

The key difference between a lambda and a regular function is that a lambda can only contain a single expression. What’s an expression?

For the statement:

x = 2*(3 + 5)

2*(3 + 5) is the expression.

Python’s lambdas follow a programming pattern that comes from the lambda calculus - namely, functional programming.

What’s functional programming? In procedural programming, a block of code is a sequence of statements, including assignments that create variables or change their values:

x = 1
y = 2
x += y

In functional programming, by contrast, code is just an expression - the kind of code you see on the right-hand side of the = assignment operator.
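As a small illustration (my own example, not from the course), a lambda’s single expression can itself be a conditional expression, so even branching stays purely expression-based:

```python
# 'even' if ... else 'odd' is one conditional expression,
# so it is a valid lambda body - no if statement needed
parity = lambda n: 'even' if n % 2 == 0 else 'odd'
print(parity(4))  # even
print(parity(7))  # odd
```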

Java Equivalent

The Python lambda function conversion in Java is achieved using Lambda Expressions, a feature introduced in Java 8 to support functional programming. In Java, a lambda expression is a concise way to create an instance of a functional interface (an interface with exactly one abstract method).

So this requires defining a functional interface to serve as the target type for the lambda. We can define a simple interface, or use a built-in one like IntUnaryOperator (for operations on a single integer operand that return an integer result):

import java.util.function.IntUnaryOperator;

public class LambdaConversion {

    public static void main(String[] args) {
        // In Python: f = lambda x: x + 1
        IntUnaryOperator increment = (x) -> x + 1;

        // In Python: (lambda x: x + 1)(2)
        int result = increment.applyAsInt(2);

        System.out.println(result); // Output: 3
    }
}

LambdaConversion.main(null)

Key Differences and Java Concepts:

  1. Context: Java code must reside within a class.

  2. Anonymous Nature: Just like Python’s lambda, the Java lambda expression itself is an anonymous method, but it must be assigned to an instance of a functional interface (IntUnaryOperator in this case).

  3. Return Type: In both Python and Java, if the body of the lambda is a single expression, the result of that expression is implicitly returned without needing a return keyword.
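To make the implicit-return difference concrete, here is a sketch (my own, not from the course): when a Java lambda body is a block rather than a single expression, braces and an explicit return become mandatory:

```java
import java.util.function.IntUnaryOperator;

public class LambdaBodyConversion {

    public static void main(String[] args) {
        // Single expression: the result is returned implicitly
        IntUnaryOperator inc = x -> x + 1;

        // Block body: braces and an explicit return are required
        IntUnaryOperator incVerbose = x -> {
            int result = x + 1;
            return result;
        };

        System.out.println(inc.applyAsInt(2));        // 3
        System.out.println(incVerbose.applyAsInt(2)); // 3
    }
}

LambdaBodyConversion.main(null);
```

Python draws the line differently: a lambda body can never be a block of statements at all; for that you need a def.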

Finally - the map function

Our final example will demonstrate functional programming in action by building on the Lambdas section above.

The map function takes two arguments - a function and an iterable - and returns an iterator whose elements are the results of applying the function to the elements of the iterable.

In this exercise we’re playing with the capitalize method of the string class:

>>> 'please check for typos!'.capitalize()
'Please check for typos!'

Here’s a function which takes a string and returns the capitalised version of it:

def cap(word):
    return word.capitalize()

Now we can use map to apply this function to a list of words:

>>> uc_words = map(cap, ['this', 'is', 'a', 'test'])
>>> list(uc_words)
['This', 'Is', 'A', 'Test']

Note that the function can also be a lambda:

>>> uc_words = map(
...     lambda w: w.capitalize(), ['this', 'is', 'a', 'test']
... )
>>> list(uc_words)
['This', 'Is', 'A', 'Test']
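Worth noting as an aside (my own observation, not from the course excerpt): in idiomatic Python the same transformation is often written as a list comprehension, which avoids both map and the lambda:

```python
words = ['this', 'is', 'a', 'test']
# Equivalent to list(map(lambda w: w.capitalize(), words))
print([w.capitalize() for w in words])  # ['This', 'Is', 'A', 'Test']
```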

The Python map() function converts easily to Java using Streams and Lambda Expressions.

Java Equivalent

In Java the functionality of applying a function to every element of an iterable and collecting the results is typically accomplished using the stream().map().collect() pipeline.

So the Java Equivalent of the first example would use the Stream API to perform the transformation:

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class MapConversion {

    // Define the capitalization logic as a dedicated method
    // (or use a lambda directly)
    // Note: Java's standard capitalize methods
    // typically handle the first letter only.
    // Mimicking the behavior of Python's .capitalize().
    public static String cap(String word) {
        if (word == null || word.isEmpty()) {
            return word;
        }
        // In Python: word.capitalize()
        // Capitalize first letter, lowercase the rest
        return word.substring(0, 1).toUpperCase()
            + word.substring(1).toLowerCase();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList(
            "this", "is", "a", "test"
        );

        // 1. Create a Stream from the list.
        // 2. Use the map to apply 'cap' to each element.
        // 3. Use Collectors.toList() to generate a List.
        List<String> ucWords = words.stream()
            .map(MapConversion::cap)
            .collect(Collectors.toList());

        System.out.println(ucWords);
    }
}

MapConversion.main(null) // Output: [This, Is, A, Test]

Now using the same Lambda Expressions, which Java can use directly in the map operation:

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class LambdaMapConversion {

    public static String capitalizeWord(String w) {
        // In Python: w.capitalize()
        // Capitalize first letter, lowercase the rest
        if (w == null || w.isEmpty()) return w;
        return w.substring(0, 1).toUpperCase()
            + w.substring(1).toLowerCase();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList(
            "this", "is", "a", "test"
        );

        // The lambda w -> capitalizeWord(w) implements the
        // functional interface expected by stream().map()
        List<String> ucWords = words.stream()
            .map(w -> capitalizeWord(w))
            .collect(Collectors.toList());

        System.out.println(ucWords);
    }
}

LambdaMapConversion.main(null) // Output: [This, Is, A, Test]

The End

So in the end, which code is better: Python or Java? I’ll leave the answer up to you. But before answering, take into consideration that with JBang we simplified the Java syntax, which saved a few lines and made the code cleaner. We also didn’t need to set up a JDK or an IDE on our PC. This workflow modernizes Java and brings it up to par with the facilities that other - even dynamic - languages like Python enjoy.

That caps the tutorial. As already mentioned, all the Python examples used here come from Geekuni’s Python Essentials course and represent just a fraction of the available material, which covers all aspects of the language.