Dec. 3, 2024
Sooner or later, every developer will face the need to optimize the application. It is not the simplest one and can often be carried out in several different ways. Today, we will look at a function that will often speed uploading a page written with Django - select_related.
houses = House.objects.filter(param=value)
then we need the information to complete the table with the basic parameters of the houses found, e.g.:
roof = house.roof.angle (kąt nachylenia dachu)
heating_type = house.heating_type.name (rodzaj ogrzewania)
garage = house.garage.area() (powierzchnia garażu)
If we were lucky and one object appears in the search results, the page will refresh in the blink of an eye. Remember, however, that under cover of the above Python functions, there are 4 database queries.
SELECT ••• FROM app_name_house WHERE "house"."param" = 'value'
At this point, we have a house list containing one house object, however its fields: roof, heating_type, and the garage only contain the id information of these objects in the corresponding table; let's assume they are X, Y, Z, respectively. Three queries are sent from the houses in the houses list:
roof:SELECT ••• FROM app_name_roof WHERE "roof"."id" = X
heating_type:SELECT ••• FROM app_name_heating WHERE "heating_type"."id" = Y
garage: SELECT ••• FROM app_name_garage WHERE "model_name"."id" = Z
It does not look very dangerous; after all, what are four queries for today's data carriers? However, we need to look at this issue from a broader perspective. We cannot assume that the result will contain one object; we will operate on the list of objects much more often, for example, one hundred, one thousand, 10,000 or more, and this is where - remaining in the construction topics - “stairs” appear. Assuming that the list of results includes 1000 houses, and we need three parameters for each of them, the number of queries will increase to 3001, which will noticeably slow down the page. The exact value depends, among other things, on the "size" of the objects. It may be surprising that in the discussed example, the type of hard disk used in the computer does not have a major impact on the course of the operation, which I checked on 3 types of disks:
HDD |
SSD I |
SSD II |
|
Number of inquiries |
3001 |
||
Page load time (s) |
64.87 |
63.49 |
63.18 |
Time of query execution (s) |
0.32 |
0.30 |
0.29 |
I used the Django Debug Toolbar for time measurement
Probably many people's intuition tells them that something is wrong here; modern computers can't perform such simple operations for several dozen seconds. This is, in fact, the case; it's a perfect example of using the select_related function. Let's make a query to the database again, this time using the mentioned function
houses = House.objects.select_related('roof', 'heating_type', 'garage').filter(param=value)
It turns out that the number of queries to the database has decreased to ... one, which looks like this:
SELECT ••• FROM "app_name_house" INNER JOIN "app_name_furniture" ON
("app_name_house"."roof_id" = "app_name_roof"."id")
INNER JOIN "app_test_heatingtype" ON
("app_name_house"."heating_type_id" = "app_name_heatingtype"."id")
INNER JOIN "app_name_room" ON
("app_name_house"."garage_id" = "app_name_garage"."id")
WHERE "app_name_house"."param" = 'value'
The page loading time has also decreased significantly:
|
HDD |
SSD I |
SSD II |
Number of inquiries |
1 |
||
Page load time (s) |
0.00248 |
0.00264 |
0.00251 |
Time of query execution (s) |
0.001 |
0.001 |
0.001 |
Looking at the table above, you may be tempted to use the select_related function always and everywhere or, for example, create a list of ForeignKey fields of a given object and pass it as an argument to the function. Unfortunately, it is not that simple and will likely produce the opposite effect to what is expected. It may turn out that despite the same number of inquiries, we will extract a lot of data that we do not need now, which will eventually slow down the page. To illustrate this phenomenon, I decided to create 10,000 House model objects, add a text box to the 10 models referenced by this model, then fill it with one page of "Lorem ipsum" and pull it out with one query. The differences in the operation time are presented in the table below.
Number of inqueries |
1 |
|||||||
Number of retrieved objects |
10000 |
|||||||
The number of ForeignKey fields |
13 |
|||||||
Using fields of ForeignKey |
0 |
3 |
||||||
The fields specified in the select_related function |
0 |
3 |
8 |
14 |
3 |
8 |
13 |
|
Loading time after filling in the text fields (s) |
0.42 |
1.17 |
2.41 |
3.05 |
1.66 |
2.67 |
3.33 |
What is worth remembering?