If your NumPy-based code is too slow, you can sometimes use Numba to speed it up. Numba is a compiled language that uses the same syntax as Python, and it compiles at runtime, so it’s very easy to write. And because it re-implements a large part of the NumPy APIs, it can also eas … | Continue reading
When it comes to fighting climate change, I strongly believe that getting involved in politics is one of the most useful things you can do. But given how energy-intensive software is these days, writing more efficient software also seems worth doing, especially if your software i … | Continue reading
Climate change is impacting the whole planet, and getting worse every year. So you want to do something—but you’re not sure what. If you do some research you might encounter an essay by Bret Victor—What can a technologist do about climate change? There’s a whole pile of good idea … | Continue reading
If you’re doing computations on a GPU, NVIDIA is the default, alongside its CUDA libraries. Some libraries like PyTorch do support AMD GPUs and Macs. But from the re-implementations of NumPy, SciPy, and Pandas in the RAPIDS project, to Numba’s GPU support, NVIDIA has best … | Continue reading
If you’re writing numeric Python code, Numba can be a great way to speed up your program. By compiling a subset of Python to machine code, Numba lets you write for loops and other constructs that would be too slow in normal Python. In other w … | Continue reading
Do you use NumPy, Pandas, or scikit-learn and want to get faster results? NVIDIA has created GPU-based replacements for each of these with the shared promise of extra speed. For example, if you visit the front page of NVIDIA’s RAPIDS project, you’ll see benchmarks showing cuDF, a … | Continue reading
If you’re writing scientific or data science code with Python, there’s a good chance you’re using NumPy, directly or indirectly. Pandas, Scikit-Image, SciPy, Scikit-Learn, AstroPy… these and many other packages depend on NumPy. NumPy 2 is a new major release, with a release candi … | Continue reading
When you’re running a CPU-intensive parallel program, you often want to have a thread or process pool sized by the number of CPU cores on your machine. Fewer threads and you’re not taking advantage of all the cores, more than that and your program will start running slower as mul … | Continue reading
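A minimal sketch of sizing a pool by core count, using only the standard library. The helper names `usable_cpu_count` and `make_pool` are hypothetical, not from the article. `os.sched_getaffinity` only exists on some platforms (Linux, for example), so the sketch falls back to `os.cpu_count()` elsewhere.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def usable_cpu_count() -> int:
    """Return how many CPU cores this process may run on (hypothetical helper)."""
    if hasattr(os, "sched_getaffinity"):
        # Respects CPU affinity restrictions placed on this process.
        return len(os.sched_getaffinity(0))
    # os.cpu_count() can return None, so fall back to a single worker.
    return os.cpu_count() or 1


def make_pool() -> ThreadPoolExecutor:
    """Create a thread pool with one worker per usable core."""
    return ThreadPoolExecutor(max_workers=usable_cpu_count())
```

Whether one worker per core is the right size depends on the workload, so treat the result as a starting point to measure against.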
Polars is a dataframe-based library that can be faster, more memory efficient, and often simpler to use than Pandas. It’s also much newer, and correspondingly less popular. In November 2023: Polars had ~2.6 million downloads from PyPI. Pandas had ~140 million downloads! Becau … | Continue reading
When you’re doing large scale data processing with Python, threads are a good way to achieve parallelism. This is especially true if you’re doing numeric processing, where the global interpreter lock (GIL) is typically not an issue. And if you’re using threading, thread pools are … | Continue reading
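As a rough illustration of a thread pool, the sketch below hashes several byte buffers in parallel. `hashlib` stands in for numeric code here because it releases the GIL while hashing large buffers. The function names and the default `max_workers=4` are assumptions made for the example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def checksum(chunk: bytes) -> str:
    """Stand-in for a numeric kernel that releases the GIL while it runs."""
    return hashlib.sha256(chunk).hexdigest()


def checksum_all(chunks, max_workers=4):
    """Hash every chunk using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(checksum, chunks))
```

Leaving the `with` block waits for all submitted work to finish and then shuts the pool down.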
Python 3.12 is out now–but should you switch to it immediately? And if you shouldn’t upgrade just yet, when should you? Immediately after the release, you may not want to upgrade just yet. But from December 2023 and onwards, upgrading is definitely worth trying. To understand why … | Continue reading
Cython allows you to write compiled extensions for Python, by translating Python-y code to C or C++. Often you’ll use it to speed up your software, and it’s especially useful for implementing small data science or scientific computing algorithms. But what happens when Cython is t … | Continue reading
The common advice when Python is too slow is to switch to a low-level compiled language. But what do you do if that code is too slow? Almost always there are still plenty of performance improvements you can get just … | Continue reading
If you want to speed up some existing Python code, writing a compiled extension in Rust can be an excellent choice: In many situations, Rust code can run much faster than Python. Rust prevents most of the memory-management bugs that often occur in C, C++, and Cython code. The … | Continue reading
If you’re doing numeric calculations, NumPy is a lot faster than plain Python, but sometimes that’s not enough. What should you do when your NumPy-based code is too slow? Your first thought might be parallelism, but that should probably be the last thing you consider. There a … | Continue reading
When you need to speed up your NumPy processing—or just reduce your memory usage—the Numba just-in-time compiler is a great tool. It lets you write Python code that gets compiled at runtime to machine code, allowing you to get the kind of speed improvements you’d get from languag … | Continue reading
Before you can process your data with Pandas, you need to load it (from disk or remote storage). There are plenty of data formats supported by Pandas, from CSV, to JSON, to Parquet, and many others as well. Which should you use? You don’t want loading the data to be slow, or us … | Continue reading
You’re on a new version of Linux, you try a pip install, and it errors out, talking about “externally managed environments” and “PEP 668”. What’s going on? How do you solve this? Let’s see: What the problem looks like, and what causes it. The places you are likely to encounter … | Continue reading
Flake8 and PyLint are commonly used, and very useful, linting tools: they can help you find potential bugs and other problems with your code, aka “lints”. But they can also be slow. And even if they’re fast on your computer, they may still be slow in your CI system (GitHub Action … | Continue reading
Initial data analysis (IDA) has different goals than your final, production data analysis: With IDA you need to examine the initial data and intermediate results, check your assumptions, and try different approaches. Exploratory data analysis has similar requirements. Once you … | Continue reading
When building Docker images, caching lets you speed up rebuilding images. But this has a downside: it can keep you from installing security updates from your base Linux distribution. If you cache the image layer that includes the security update… you’re not getting new security u … | Continue reading
If you’re doing text or string manipulation in Python, what do you do if your code is too slow? Assuming your algorithm is reasonably efficient, the next step is to try faster alternatives to Python: a compiled extension. Unfortunately, this is harder than it seems. Some options … | Continue reading
Because Python has limited parallelism when using threads, using worker processes is a common way to take advantage of multiple CPU cores. The multiprocessing module is built into the standard library, so it’s frequently used for this purpose. But while multiple processes let yo … | Continue reading
You have a file with data you want to process with Pandas, and you want to make sure you won’t run out of memory. How do you estimate memory usage given the file size? At times you may see estimates like these: “Have 5 to 10 times as much RAM as the size of your dataset”, or “ … | Continue reading
Libraries like NumPy and Pandas let you switch data types, which allows you to reduce memory usage. Switching from numpy.float64 (“double-precision” or 64-bit floats) to numpy.float32 (“single-precision” or 32-bit floats) cuts memory usage in half. But it does so at a cost: float … | Continue reading
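To make the trade-off concrete without requiring NumPy, here is a small sketch that uses the standard library's `array` module as a stand-in for NumPy arrays: type code `"d"` stores 64-bit floats and `"f"` stores 32-bit floats. The sample values are arbitrary, chosen only for illustration.

```python
from array import array

# Arbitrary sample values, chosen only for illustration.
values = [0.1, 1.5, 123456.789]

doubles = array("d", values)  # 64-bit ("double-precision") floats
singles = array("f", values)  # 32-bit ("single-precision") floats

# Storage needed for the raw numbers in each representation.
bytes_double = doubles.itemsize * len(doubles)
bytes_single = singles.itemsize * len(singles)

# Largest difference between the 64-bit values and their 32-bit versions.
max_error = max(abs(d - s) for d, s in zip(doubles, singles))

print(bytes_double, bytes_single, max_error)
```

The 32-bit array takes half the bytes, while values such as 0.1 and 123456.789 no longer match their 64-bit counterparts exactly.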
Python 3.11 has been released—when should you switch to using it? | Continue reading
Your data processing jobs are fast… most of the time. Next, find the slow runs so you can speed them up. | Continue reading
The causes of performance bottlenecks vary widely, from network latency to software bugs. Observation in production may therefore be the only way to find them. | Continue reading
New Macs can break your Docker image build in unexpected ways; learn why, and how to fix it. | Continue reading
Vectorization in Pandas can make your code faster—except when it will make your code slower. | Continue reading
Installing packages with pip, Poetry, and Pipenv can be slow. Learn how to ensure it’s not even slower, and a potential speed-up. | Continue reading
msgspec is a schema-based JSON encoder/decoder, which allows you to process large files with lower memory and CPU usage. | Continue reading
Python’s Global Interpreter Lock (GIL) stops threads from running Python code in parallel. Learn how to determine the impact of the GIL on your code. | Continue reading
Throwing hardware at a software performance problem is often an easy solution, and sometimes the right one. Learn how to approach the decision, and some alternatives. | Continue reading
Loading complete JSON files into Python can use too much memory, leading to slowness or crashes. The solution: process JSON data one chunk at a time. | Continue reading
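One simple way to process JSON a chunk at a time, sketched below, assumes the data is newline-delimited JSON (one record per line). That assumption is mine, not the article's. A single large JSON document would need an incremental parser instead. The function names and the `"n"` field are hypothetical.

```python
import io
import json


def iter_records(lines):
    """Yield one parsed record at a time from newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


def total_field(lines, field):
    """Sum a numeric field while holding only one record in memory at a time."""
    return sum(record[field] for record in iter_records(lines))


# Usage: any iterable of lines works, including an open file object.
sample = io.StringIO('{"n": 1}\n{"n": 2}\n\n{"n": 39}\n')
print(total_field(sample, "n"))
```

Because `iter_records` is a generator, memory use is governed by the largest single record rather than by the size of the whole file.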
Python-based calculations, especially those that use NumPy, can run much faster by using the Numba library. | Continue reading
Learn the fastest way to read a CSV into Pandas. | Continue reading
Python 3.6 will stop getting security updates in December 2021. Given the existence of 3.7, 3.8, 3.9, and 3.10, you really should upgrade. | Continue reading
Flamegraphs are a great way to visualize performance and memory bottlenecks, but with a little tweaking, you can make them even more useful. | Continue reading
When is it worth the money to buy a product that will help you with your job? Learn how to decide, and how to convince your boss to approve the purchase. | Continue reading
Conda installs are very slow, but you can speed them up with a much-faster Conda reimplementation called Mamba. | Continue reading
You can write Python extensions with Cython, Rust, and many other tools. Learn which one you should use, depending on your particular needs. | Continue reading
Python has two packaging systems, pip and Conda. Learn the differences between them so you can pick the right one for you. | Continue reading
Python 3.10 is out now—when should you start using it? | Continue reading
If you’re using GitLab CI to build your software, you might also want to use it to build Docker images of your application. This can be a little tricky, because by default GitLab CI runs jobs inside Docker containers. The standard technique for getting around this problem is usin … | Continue reading
BuildKit is a new and improved tool for building Docker images: it’s faster, has critical features missing from traditional Dockerfiles like build secrets, plus other useful features like cache mounting. So if you’re building Docker images, using BuildKit is in general a g … | Continue reading
If you’re using Python’s NumPy library, it’s usually because you’re processing large arrays that use plenty of memory. To reduce your memory usage, chances are you want to minimize unnecessary copying. NumPy has a built-in feature that does this transparently, in many common case … | Continue reading