PaRaD1SE

Improving Performance with Multi-Threading in Python

Published: 2023/3/9

Learn how to boost the performance of your Python programs using multi-threading.

Usage of "for" Loop

The for loop is the most commonly used loop statement in Python. It runs a block of code once for each item in an iterable, such as a list or a range.
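As a minimal sketch of the syntax (the list `fruits` and its contents are made up for illustration):

```python
# Hypothetical sample data, used only to illustrate the syntax.
fruits = ["apple", "banana", "cherry"]

lengths = []
for fruit in fruits:
    # The indented body runs once per item, in order.
    lengths.append(len(fruit))

print(lengths)  # [5, 6, 6]
```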


The Problem

The issue with the for loop is that it executes sequentially: each iteration starts only after the previous one has finished. If the body of the loop takes a long time to run, the whole loop takes a long time as well.

When web scraping, we often encounter situations where we need to make multiple requests to an API by changing an index parameter, and then concatenate the data for analysis.
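A rough sketch of this sequential pattern might look like the following. The function `fetch_page` is a hypothetical stand-in for one API request: it uses `time.sleep` to simulate network latency so the example runs without a real server, and the item names it returns are invented.

```python
import time


def fetch_page(index):
    """Hypothetical stand-in for one API request.

    A real scraper would send an HTTP request here with `index` as a
    parameter; time.sleep simulates the time spent waiting on the network.
    """
    time.sleep(0.1)
    return [f"item-{index}-{i}" for i in range(3)]


start = time.perf_counter()
all_items = []
for index in range(1, 6):
    # Each request starts only after the previous one has returned.
    all_items.extend(fetch_page(index))
elapsed = time.perf_counter() - start

print(f"{len(all_items)} items in {elapsed:.2f}s")
```

With five simulated requests of about 0.1 seconds each, the loop needs roughly half a second, almost all of it spent waiting.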


However, network access is often the bottleneck in a program's execution speed. If we fetch the data with a for loop, most of the running time is spent waiting for responses.

Enabling Multi-threading

Fortunately, Python supports multi-threading, which lets us open several threads and fetch multiple URLs at the same time, saving a lot of time. For this kind of task, a multi-threaded version can take the place of the for loop.
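One common way to do this is with the standard library's `concurrent.futures.ThreadPoolExecutor`. The sketch below is only one possible approach, chosen here as an assumption; as before, `fetch_page` is a hypothetical stand-in that simulates a request with `time.sleep`.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_page(index):
    """Hypothetical stand-in for one API request (simulated with sleep)."""
    time.sleep(0.1)
    return [f"item-{index}-{i}" for i in range(3)]


indexes = range(1, 6)

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=5) as executor:
    # map() hands every index to the pool and yields the results
    # in the same order as the inputs, just as the for loop would.
    pages = list(executor.map(fetch_page, indexes))
elapsed = time.perf_counter() - start

# Concatenate the per-page results into one list.
all_items = [item for page in pages for item in page]
print(f"{len(all_items)} items in {elapsed:.2f}s")
```

Because the five simulated requests wait at the same time, the total should be close to the duration of a single request rather than the sum of all five.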


Principles and Considerations for Speeding up with Multi-threading

Note that multi-threading improves efficiency in this example not because we are squeezing more performance out of the computer, but because the CPU is idle while data is downloading: in the for loop, it waits for each download to complete before starting the next one. In short, what slows the program down is the network, not the CPU. Therefore, if your task consists mainly of local computation, multi-threading is not guaranteed to improve its speed. In standard CPython builds, the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time, so threads doing pure computation mostly take turns rather than run in parallel.
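The difference can be illustrated with a small timing experiment. This is a sketch under assumptions: `io_task` simulates a network wait with `time.sleep`, `cpu_task` is an arbitrary computation, and the task sizes are picked only to keep the run short.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def io_task(_):
    # Simulated network wait: the thread is idle, so other threads can run.
    time.sleep(0.1)
    return 1


def cpu_task(n):
    # Pure computation: the thread keeps the interpreter busy the whole time.
    return sum(i * i for i in range(n))


def timed(run):
    """Call run() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = run()
    return result, time.perf_counter() - start


def sequential(task, args):
    return [task(a) for a in args]


def threaded(task, args):
    with ThreadPoolExecutor(max_workers=4) as executor:
        return list(executor.map(task, args))


io_args = [None] * 4
cpu_args = [200_000] * 4

io_seq, io_seq_time = timed(lambda: sequential(io_task, io_args))
io_thr, io_thr_time = timed(lambda: threaded(io_task, io_args))
cpu_seq, cpu_seq_time = timed(lambda: sequential(cpu_task, cpu_args))
cpu_thr, cpu_thr_time = timed(lambda: threaded(cpu_task, cpu_args))

print(f"I/O-bound: sequential {io_seq_time:.2f}s, threaded {io_thr_time:.2f}s")
print(f"CPU-bound: sequential {cpu_seq_time:.2f}s, threaded {cpu_thr_time:.2f}s")
```

On a standard CPython build, the threaded I/O-bound run typically finishes in about the time of a single wait, while the two CPU-bound timings tend to come out close to each other. Exact numbers depend on the machine and interpreter.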

In addition, sending a huge number of requests to an API at the same time is not a good idea. We can cap the maximum number of threads to avoid putting too much pressure on the server.
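With `concurrent.futures.ThreadPoolExecutor`, one way to set such a cap is the `max_workers` argument. The sketch below assumes a limit of 3 (an arbitrary value) and uses a hypothetical `fetch_page` that records how many simulated requests are in flight at once, to show that the limit holds.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 3  # assumed limit; tune it to what the server tolerates

# Shared counters, protected by a lock, to observe the concurrency level.
stats = {"active": 0, "peak": 0}
lock = threading.Lock()


def fetch_page(index):
    """Hypothetical stand-in for one API request (simulated with sleep)."""
    with lock:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
    time.sleep(0.05)
    with lock:
        stats["active"] -= 1
    return index


with ThreadPoolExecutor(max_workers=MAX_WORKERS) as executor:
    # Ten tasks are queued, but at most MAX_WORKERS run at any moment.
    results = list(executor.map(fetch_page, range(10)))

print(f"fetched {len(results)} pages, at most {stats['peak']} at a time")
```

The remaining tasks wait in the pool's queue until a worker thread becomes free, so the server never sees more than `MAX_WORKERS` simultaneous requests from this program.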

Tags:

Development
Python
Multi-threading
