Query optimization in django using select_related() and prefetch_related()

Sachin Gadhari
4 min readDec 20, 2020

This post will cover how to optimize a query using select_related and prefetch_releated for better performance of rest API.

As our app grows and the dataset gets larger and larger, we need to make sure that the serializer or the view sets that we wrote should be well optimized and should not have any N+1 query problem.

Now let’s take an example to understand what I meant by the N+1 query problem. Let's say we have the below models in our models.py

And we have our serializers and api_views structured as below

Now you might be wondering what’s an issue with the code we just saw?

So let me show you the problem we have here. In our serializers, we have nested PackagesListSerializer within the StudentPackagesListSerializer which will eventually be executing separate queries for each foreign key related object.

Suppose we have 1000 objects in our studentpackage table and 1000 objects in our package table. Then here 1 query will be executed to get all items from studentpackage and while iterating through each object of studentpackage a separate query is executed to get the package details which results in 1001 queries (N+1).

To avoid the above problem we can make use of either select_related() or prefetch_related() which is provided by the Django framework.

select_related

We can use select_related on any ForeignKey or OneToOne relation fields in our models. It will return a single but more complex query by performing SQL join operation between the two tables which follow the foreign-key or onetoone relations.

Let's use the select_related on our queryset to fetch the package details upfront, which will join the studentpackage and package tables and remove separate lookup for each package, which means we reduced the number of queries from 1001 to 1.

class StudentPackageListView(ModelViewSet):
serializer_class = StudentPackagesListSerializer
def get_queryset(self):
queryset = StudentPackage.objects.all() \
.select_related('package')
return queryset

prefetch_related

It also works similar to select_related(), but the strategy is quite different here.

The main difference in both the method is select_related() returns a larger result set that contains all the related objects in the single database query whereas prefetch_related() does a separate lookup for each relationship, and does the ‘joining’ in Python.

Let's use prefetch_related() and see what it does behind the scene

class StudentPackageListView(ModelViewSet):
serializer_class = StudentPackagesListSerializer
def get_queryset(self):
queryset = StudentPackage.objects.all() \
.prefetch_related('package')
return queryset

As shown in the above image, now we are fetching all the related package objects in a separate database query, which removed that 1000 separate lookups for the package table from the serializer.

We can also use reverse lookup with prefetch_related(). In our example, if we want to fetch the group details in the same API response wher we don't have a direct relationship with the group table, but it has foreign-key relationship with studentpackage table.

In this case, we can pass related_name of the field from the group table for a related object which has a foreign key relationship with studentpackage table to prefetch_related().

class StudentPackageListView(ModelViewSet):
serializer_class = StudentPackagesListSerializer
def get_queryset(self):
queryset = StudentPackage.objects.all() \
.prefetch_related(
'package',
'group_studentpackages'
)
return queryset

Now we will get the queryset which will fetch the related objects for package and group tables in separate lookups.

There are a few things that we should consider before going for select_related and prefetch_related. select_related is limited to single-valued relationships — foreign key and one-to-one and it also returns the huge queryset with all related objects whereas in prefetch_related you don't need to worry about that as it will be doing a separate lookup for all related objects and you can use it with all type of relationships like a foreign key, one-to-one and many-to-many.

It is always a good practice that we can perform database operations before passing the data to the serializers.

--

--