Fastest Way to Read Excel in Python

I don't have any data to support this next claim, but I'm fairly sure that Excel is the most common way to store, manipulate, and yes(!), even pass data around. This is why it's not uncommon to find yourself reading Excel in Python. I recently needed to, so I tested and benchmark … | Continue reading


@hakibenita.com | 4 months ago

When Good Correlation is Not Enough

Choosing to use a block range index (BRIN) to query a field with high correlation is a no-brainer for the optimizer. The small size of the index and the field's correlation make BRIN an ideal choice. However, a recent event taught us that correlation can be misleading. Under som … | Continue reading


@hakibenita.com | 9 months ago

Future Proofing SQL with Carefully Placed Errors

Backward compatibility is straightforward. You have full control over new code and you have full knowledge of past data and APIs. Forward compatibility is more challenging. You have full control over new code, but you don't know how data is going to change in the future, and what … | Continue reading


@hakibenita.com | 1 year ago

Handling Concurrency Without Locks

How to not let concurrency cripple your system | Continue reading


@hakibenita.com | 1 year ago

2021 Year in Review

Painters like their paintings to be viewed, musicians like their music to be heard, and writers like their articles to be read. This is why every year I look back at what I've done: This year I published 6 articles. 5 articles were about SQL, and 2 were also about Python and Dja … | Continue reading


@hakibenita.com | 2 years ago

Lesser Known PostgreSQL Features

Features you already have but may not know about! | Continue reading


@hakibenita.com | 2 years ago

One Database Transaction Too Many

How I told hundreds of users they got paid when they didn't! | Continue reading


@hakibenita.com | 2 years ago

Practical SQL for Data Analysis

What you can do without Pandas | Continue reading


@hakibenita.com | 3 years ago

Exciting New Features in Django 3.2

Django 3.2 is just around the corner and it's packed with new features. Django versions are usually not that exciting (it's a good thing!), but this time many features were added to the ORM, so I find it especially interesting! This is a list of my favorite features in Django 3.2 … | Continue reading


@hakibenita.com | 3 years ago

How to Get the First or Last Value in a Group Using Group by in SQL

A neat little trick using arrays in PostgreSQL | Continue reading


@hakibenita.com | 3 years ago

The Unexpected Find That Freed 20GB of Unused Index Space in PostgreSQL

How to free space without dropping indexes or deleting data | Continue reading


@hakibenita.com | 3 years ago

Re-Introducing Hash Indexes in PostgreSQL

The Ugly Duckling of index types | Continue reading


@hakibenita.com | 3 years ago

2020 Year in Review

The year 2020 has been a turbulent year, but it was also a year of personal and professional growth: This year I published 11 articles. This is consistent with my personal goal of publishing every month. 7 of these articles were about Python, and the other 4 about SQL. Accordin … | Continue reading


@hakibenita.com | 3 years ago

Exhaustiveness Checking in Python Using Mypy

Fail at compile time, not at run time | Continue reading


@hakibenita.com | 3 years ago

The Surprising Impact of Medium-Size Texts on PostgreSQL Performance

Why TOAST is the best thing since sliced bread | Continue reading


@hakibenita.com | 3 years ago

Simple Anomaly Detection Using Plain SQL

Identify Problems Before They Become Disasters | Continue reading


@hakibenita.com | 3 years ago

Some SQL Tricks of an Application DBA

Non-trivial tips for database development | Continue reading


@hakibenita.com | 3 years ago

Stop Using datetime.now!

One of my favorite job interview questions is this: "Write a function that returns tomorrow's date." This looks innocent enough for someone to suggest this as a solution:

    import datetime

    def tomorrow() -> datetime.date:
        return datetime.date.today() + datetime.timedelta(days=1)

Th … | Continue reading


@hakibenita.com | 3 years ago

How to Move a Django Model to Another App

In my latest article for RealPython I share three ways to tackle one of the most challenging tasks involving Django migrations: moving a model from one Django app to another. The article covers some exotic migration operations and many of the built-in migration CLI commands such … | Continue reading


@hakibenita.com | 4 years ago

Testing an Interactive Voice Response System With Python and Pytest

Following my previous article on how to build an Interactive Voice Response (IVR) system with Twilio, Python and Django, in this follow-up tutorial I show how to write automated tests for this system. It can be very challenging to test a system that relies heavily on a third party … | Continue reading


@hakibenita.com | 4 years ago

How to Provide Test Fixtures for Django Models in Pytest

One of the most challenging aspects of writing good tests is maintaining test fixtures. Good test fixtures motivate developers to write better tests, and bad fixtures can cripple a system to a point where developers fear and avoid them altogether. The key to maintaining good fi … | Continue reading


@hakibenita.com | 4 years ago

Using Markdown in Django

As developers, we rely on static analysis tools to check, lint and transform our code. We use these tools to help us be more productive and produce better code. However, when we write content using markdown the tools at our disposal are scarce. In this article we describe how we … | Continue reading


@hakibenita.com | 4 years ago

Things You Must Know About Django Admin as Your App Gets Bigger

From Zero to Hero With Django Admin | Continue reading


@hakibenita.com | 4 years ago

Building an IVR System with Python, Django and Twilio

Last year my team and I worked on a very challenging IVR system. After almost a year in production and thousands of processed transactions, I teamed up with the great people over at the Twilio blog to write an introductory tutorial for developing IVR systems using Django and Twil … | Continue reading


@hakibenita.com | 4 years ago

Understand Group by in Django with SQL

Aggregation is a source of confusion in any type of ORM and Django is no different. The documentation provides a variety of examples and cheat-sheets that demonstrate how to group and aggregate data using the ORM, but I decided to approach this from a different angle. In this art … | Continue reading


@hakibenita.com | 4 years ago

Common Mistakes and Missed Optimization Opportunities in SQL

Made by Developers and Non-Developers | Continue reading


@hakibenita.com | 4 years ago

Preventing SQL Injection Attacks With Python

SQL injections are consistently ranked among the most common attacks against systems. For this reason, ORMs offer many ways of dealing with injections. A common solution is bind variables, a placeholder in the query that is sanitized by the ORM for safe execution in the database. H … | Continue reading


@hakibenita.com | 4 years ago

How "Export to Excel" Almost Killed Our System

A few weeks ago we had some trouble with an "Export to Excel" functionality in one of our systems. In the process of resolving this issue, we made some interesting discoveries and came up with original solutions. This article is inspired by the actual issue we used to track this … | Continue reading


@hakibenita.com | 4 years ago

What You Need to Know to Manage Users in Django Admin

Have you ever stopped to think what your staff users can do in Django admin? Did you know staff users with misconfigured permissions on the user model can make themselves superusers? Permissive permissions to staff users can cause disastrous human errors at best, and lead to majo … | Continue reading


@hakibenita.com | 4 years ago

Solving a Storage Problem in PostgreSQL Without Adding a Single Byte of Storage

A short story about a storage-heavy query and the silver bullet that solved the issue | Continue reading


@hakibenita.com | 4 years ago

Fastest Way to Load Data into PostgreSQL Using Python

From two minutes to less than half a second! | Continue reading


@hakibenita.com | 4 years ago

Improve Serialization Performance in Django Rest Framework Including Benchmarks

How we reduced serialization time by 99%! | Continue reading


@hakibenita.com | 4 years ago

How to Let Google Know of Other Languages in Your Django Site

If you have a public-facing Django site in multiple languages, you probably want to let Google and other search engines know about it. Multi-Language Django Site: Django has a very extensive framework to serve sites in multiple languages. The l … | Continue reading


@hakibenita.com | 5 years ago

How to Create an Index in Django Without Downtime

If you maintain a Django site with decent traffic, you probably need to deal with graceful migrations. With the help of the RealPython team, I wrote an article about one of the most common problems in Django migrations: How to create an index without causing downtime. In the artic … | Continue reading


@hakibenita.com | 5 years ago

How to Use Grouping Sets in Django

I recently had the pleasure of optimizing an old dashboard. The solution we came up with required some advanced SQL that Django does not support out of the box. In this article I present the solution, how we got to it, and a word of caution. This article covers advan … | Continue reading


@hakibenita.com | 5 years ago

It's Time to Own My Own Content

I started writing about two years ago. Back then, I used to read a lot on Medium. When I finally felt the urge to write something, it made sense to publish there as well. Medium provided me with a platform, an audience, and constant reinforcements in the form of stats, likes and … | Continue reading


@hakibenita.com | 5 years ago

Modeling Polymorphism in Django

If you ever added a type, kind or a mode field to a Django model, you probably had to deal with polymorphism at some level. With the great people over at RealPython, I wrote about 5 ways to model polymorphism in Django. Read "Modeling Polymorphism in Django With Python" on RealPy … | Continue reading


@hakibenita.com | 5 years ago

Be Careful With CTE in PostgreSQL

Common table expressions (CTE), also known as the WITH clause, are a very useful feature. They help break down big queries into smaller pieces, which makes them easier to read and understand. PostgreSQL Version: This article is intended for PostgreSQL versions 11 and prior. Starting … | Continue reading


@hakibenita.com | 5 years ago

Automating the Boring Stuff in Django Using the Check Framework

Every team has a unique development style. Some teams implement localization and require translations. Some teams are more sensitive to database issues and require more careful handling of indexes and constraints. Existing tools cannot always address these specific issues out of … | Continue reading


@hakibenita.com | 5 years ago

9 Django Tips for Working with Databases

ORMs offer great utility for developers but abstracting access to the database has its costs. Developers who are willing to poke around the database and change some defaults often find that great improvements can be made. Aggregation with Filter: Prior to Django 2.0 if we wanted t … | Continue reading


@hakibenita.com | 6 years ago

How to Add a Text Filter to Django Admin

When creating a new Django Admin page, a common conversation between the developer and the support personnel might sound like this: Developer: Hey, I'm adding a new admin page for transactions. Can you tell me how you want to search for transactions? Support: Sure, I usually just s … | Continue reading


@hakibenita.com | 6 years ago

Django Admin Range-Based Date Hierarchy

A few weeks ago we encountered a major performance regression in one of our main admin pages. The page took more than 10 seconds to load (at best) and hit the query execution timeout at worst. The page was an admin list view of a transactions model, one of the main models in our … | Continue reading


@hakibenita.com | 6 years ago

Scaling Django Admin Date Hierarchy

We published a package called django-admin-lightweight-date-hierarchy which overrides the Django Admin date_hierarchy template tag and eliminates all database queries from it. For the implementation details and the shocking performance analysis read on. If you are not familia … | Continue reading


@hakibenita.com | 6 years ago

How We Replaced Dozens of Test Fixtures With One Simple Function

It all started when we added feature flags to our app. After some deliberation we created a "feature set" model with boolean fields for each feature:

    class FeatureSet(models.Model):
        name = models.CharField(max_length=50)
        can_pay_with_credit_card = models.BooleanField()

can_ … | Continue reading


@hakibenita.com | 6 years ago

How to Manage Concurrency in Django Models

The days of desktop systems serving single users are long gone. Web applications nowadays are serving millions of users at the same time. With many users comes a wide range of new problems: concurrency problems. The Problem: To demonstrate common concurrency issues we are going to … | Continue reading


@hakibenita.com | 6 years ago

5 Ways to Make Django Admin Safer

In this article I present 5 ways to protect the Django Admin from human errors and attackers. Table of Contents: Change the URL; Visually Distinguish Environments; Name Your Admin Site; Separate the Django Admin From The Main Site; Add Two Factor Authentication (2FA); Final Words … | Continue reading


@hakibenita.com | 6 years ago

The Many Faces of DISTINCT in PostgreSQL

I started my programming career as an Oracle DBA. It took a few years but eventually I got fed up with the corporate world and I went about doing my own thing. When I no longer had the comfy cushion of Oracle enterprise edition I discovered PostgreSQL. After I had gotten over not hav … | Continue reading


@hakibenita.com | 6 years ago

All You Need To Know About Prefetching in Django

I have recently worked on a ticket ordering system for a conference. It was very important for the customer to see a table of orders including a column with a list of program names in each order. The models looked (roughly) like this: class Progr … | Continue reading


@hakibenita.com | 7 years ago