In a recent post, we discussed how to run Python about 91X faster than its usual speed. In this post, I aim to discuss an unanswered question from the previous one.
How do the solutions provided perform compared to Python’s traditional performance fix technique — Cython?
Although Python is a fantastic programming language, its performance was its major drawback. The elegant language with straightforward syntax was not made for faster computing.
Cython has been bridging this gap for many years by converting Python code into compiled C programs. A range of Scientific computing packages relies on Cython to speed up computation.
Let’s compare its performance with its modern alternative.
We’ll start by counting prime numbers using plain Python. Then, we’ll compare it with its Cython version. We’ll repeat them using Python’s multiprocessing module to find out its benefits.
Finally, we’ll compare the performances with modern ways to speed up Python programs.
Cython .vs Python for simple calculations.
The most straightforward way to grasp the benefits of Cython is to use it in a rudimentary calculation. Here we use the same measure I used in the previous benchmark — Counting the number of prime numbers below a number.
Computing using Python.
The following code will count the number of prime numbers below 35,000 and print a detailed performance report.
When running the above Python code snippet, the output may look like the one below. The numbers may differ according to your PC specs.
Because there could be many factors affecting the performance, it’s wise to do the experiment several times and use the average for comparison. The average for the above execution in my PC is 18.721 seconds.
Computing using Cython.
Using Cython is different from using Python. First, we need to convert the Python script into C.
- Install Cython if it’s not already installed.
2. Create a file
count_prime.pyx with the below content. It’s the same as before.
3. Create another file,
setup.py , in the same directory with the below content. It will compile your code into a C program.
4. Now, running the following command in a terminal window will create the C versions of our prime number counter function.
5. Also, rename (or move) the pyx file because it may cause import conflicts otherwise.
6. Finally, import the function into our main app and run it the same way we’d call any other Python function.
The average time to run this program is 11.210 seconds. Here is a sample cProfile report from one of the ten trials I did.
A reduction from 18 seconds to 11 is a remarkable improvement. Cython does its job as expected. Let’s repeat the steps for the multiprocessing scenario as well.
Cython .vs Python on multiprocessing.
The multiprocessing module in Python is an excellent way to start many subprocesses. Each process will run an instance of your code. If the processor has idling cores, they can run in parallel too.
The following code will compute prime numbers below 20K, 25K, 30K, 35K, and 40K in multiple processes (parallel if possible.)
Running the above code will result in an average time of 29.7 seconds. The outputs may look like the below.
The below code will repeat the same process as the C version we compiled earlier.
The average time for this is around 18 seconds, and the output may look like the below.
Once again, Cython has proved effective in speeding up Python processing times significantly. It has cut down the time taken to run the entire loop by 37.9%.
High-performance computing with Tuplex.
We’ve seen spectacular success in using Cython. In simple prime number counting and multiprocessing application, it reduces execution time by a massive margin.
Could this be faster? Does the Tuplex library we discussed in the previous post perform better than the Cython versions?
If you’re looking for a detailed post on Tuplex, here is one.
Related Post:How to Speed up Python Data Pipelines up to 91X?
Tuplex for a single process.
Tuplex converts Python scripts into LLVM bytecodes and runs them in parallel. The result is a remarkable improvement in the performance. This new library focuses on data pipelines, but its applications are not limited to data science only.
The below code will run a single computation using Tuplex. Because we pass only a single number in the list argument of the parallelizing method, Tuplex will run it in a single thread.
The tuplex method results in an average of 9 seconds. Note that we are not considering the time Tuplex takes to set up the context.
That’s impressive compared to the 11 seconds of Cython. The results may also vary according to the computer’s architecture. But this number is decent enough to show that Cython has a worthy opponent.
Tuplex on parallel computation.
One last comparison is pending that could be the central purpose of Tuplex — parallelizing. Let’s compute the same multiprocessing exercise with Tuplex. Tweaking the last line of the Tuplex single process example will do the trick. Here’s how it should look.
The result of running the Tuplex code in parallel is an astonishing 10 seconds on average. To finish the same task, Cython took 18 seconds, and plain Python took 29.
Cython had been the primary (and only) performance fix available for Python programmers. The library does well often; thus, many other scientific computing packages use Cython internally.
Yet, a relatively new library, Tuplex, is performing in a promising way. In this article, we did a prime number counting experiment to compare Tuplex with Cython versions. There seems to be a massive reduction in the execution time as well.
But the results may vary based on your hardware, OS, and system state. Also, since Tuplex is not yet production-ready, you must use it with caution. Cython, on the other hand, is stable, and many apps in production are already using it.
The whole point of this comparison is not to conclude one is better than the other. Tuplex has a long journey ahead to come to this point. Yet, If Cython does not fit your project, it’s not a dead end. Not anymore.
Not a Medium member yet? Please use this link to become a member because I earn a commission for referring at no extra cost for you.