parts/django/docs/topics/db/optimization.txt
changeset 307 c6bca38c1cbf
equal deleted inserted replaced
306:5ff1fc726848 307:c6bca38c1cbf
       
     1 ============================
       
     2 Database access optimization
       
     3 ============================
       
     4 
       
     5 Django's database layer provides various ways to help developers get the most
       
     6 out of their databases. This document gathers together links to the relevant
       
     7 documentation, and adds various tips, organized under a number of headings that
       
     8 outline the steps to take when attempting to optimize your database usage.
       
     9 
       
    10 Profile first
       
    11 =============
       
    12 
       
    13 As general programming practice, this goes without saying. Find out :ref:`what
       
    14 queries you are doing and what they are costing you
       
    15 <faq-see-raw-sql-queries>`. You may also want to use an external project like
       
    16 django-debug-toolbar_, or a tool that monitors your database directly.
       
    17 
       
    18 Remember that you may be optimizing for speed or memory or both, depending on
       
    19 your requirements. Sometimes optimizing for one will be detrimental to the
       
    20 other, but sometimes they will help each other. Also, work that is done by the
       
    21 database process might not have the same cost (to you) as the same amount of
       
    22 work done in your Python process. It is up to you to decide what your
       
    23 priorities are, where the balance must lie, and profile all of these as required
       
    24 since this will depend on your application and server.
       
    25 
       
    26 With everything that follows, remember to profile after every change to ensure
       
    27 that the change is a benefit, and a big enough benefit given the decrease in
       
    28 readability of your code. **All** of the suggestions below come with the caveat
       
    29 that in your circumstances the general principle might not apply, or might even
       
    30 be reversed.
       
    31 
       
    32 .. _django-debug-toolbar: http://robhudson.github.com/django-debug-toolbar/
       
    33 
       
    34 Use standard DB optimization techniques
       
    35 =======================================
       
    36 
       
    37 ...including:
       
    38 
       
    39 * Indexes. This is a number one priority, *after* you have determined from
       
    40   profiling what indexes should be added. Use
       
    41   :attr:`django.db.models.Field.db_index` to add these from Django.
       
    42 
       
    43 * Appropriate use of field types.
       
    44 
       
    45 We will assume you have done the obvious things above. The rest of this document
       
    46 focuses on how to use Django in such a way that you are not doing unnecessary
       
    47 work. This document also does not address other optimization techniques that
       
    48 apply to all expensive operations, such as :doc:`general purpose caching
       
    49 </topics/cache>`.
       
    50 
       
    51 Understand QuerySets
       
    52 ====================
       
    53 
       
    54 Understanding :doc:`QuerySets </ref/models/querysets>` is vital to getting good
       
    55 performance with simple code. In particular:
       
    56 
       
    57 Understand QuerySet evaluation
       
    58 ------------------------------
       
    59 
       
    60 To avoid performance problems, it is important to understand:
       
    61 
       
    62 * that :ref:`QuerySets are lazy <querysets-are-lazy>`.
       
    63 
       
    64 * when :ref:`they are evaluated <when-querysets-are-evaluated>`.
       
    65 
       
    66 * how :ref:`the data is held in memory <caching-and-querysets>`.
       
    67 
       
    68 Understand cached attributes
       
    69 ----------------------------
       
    70 
       
    71 As well as caching of the whole ``QuerySet``, there is caching of the result of
       
    72 attributes on ORM objects. In general, attributes that are not callable will be
       
    73 cached. For example, assuming the :ref:`example Weblog models
       
    74 <queryset-model-example>`::
       
    75 
       
    76   >>> entry = Entry.objects.get(id=1)
       
    77   >>> entry.blog   # Blog object is retrieved at this point
       
    78   >>> entry.blog   # cached version, no DB access
       
    79 
       
    80 But in general, callable attributes cause DB lookups every time::
       
    81 
       
    82   >>> entry = Entry.objects.get(id=1)
       
    83   >>> entry.authors.all()   # query performed
       
    84   >>> entry.authors.all()   # query performed again
       
    85 
       
    86 Be careful when reading template code - the template system does not allow use
       
    87 of parentheses, but will call callables automatically, hiding the above
       
    88 distinction.
       
    89 
       
    90 Be careful with your own custom properties - it is up to you to implement
       
    91 caching.
       
    92 
       
    93 Use the ``with`` template tag
       
    94 -----------------------------
       
    95 
       
    96 To make use of the caching behaviour of ``QuerySet``, you may need to use the
       
    97 :ttag:`with` template tag.
       
    98 
       
    99 Use ``iterator()``
       
   100 ------------------
       
   101 
       
   102 When you have a lot of objects, the caching behaviour of the ``QuerySet`` can
       
   103 cause a large amount of memory to be used. In this case,
       
   104 :meth:`~django.db.models.QuerySet.iterator()` may help.
       
   105 
       
   106 Do database work in the database rather than in Python
       
   107 ======================================================
       
   108 
       
   109 For instance:
       
   110 
       
   111 * At the most basic level, use :ref:`filter and exclude <queryset-api>` to do
       
   112   filtering in the database.
       
   113 
       
   114 * Use :ref:`F() object query expressions <query-expressions>` to do filtering
       
   115   against other fields within the same model.
       
   116 
       
   117 * Use :doc:`annotate to do aggregation in the database </topics/db/aggregation>`.
       
   118 
       
   119 If these aren't enough to generate the SQL you need:
       
   120 
       
   121 Use ``QuerySet.extra()``
       
   122 ------------------------
       
   123 
       
   124 A less portable but more powerful method is
       
   125 :meth:`~django.db.models.QuerySet.extra()`, which allows some SQL to be
       
   126 explicitly added to the query. If that still isn't powerful enough:
       
   127 
       
   128 Use raw SQL
       
   129 -----------
       
   130 
       
   131 Write your own :doc:`custom SQL to retrieve data or populate models
       
   132 </topics/db/sql>`. Use ``django.db.connection.queries`` to find out what Django
       
   133 is writing for you and start from there.
       
   134 
       
   135 Retrieve everything at once if you know you will need it
       
   136 ========================================================
       
   137 
       
   138 Hitting the database multiple times for different parts of a single 'set' of
       
   139 data that you will need all parts of is, in general, less efficient than
       
   140 retrieving it all in one query. This is particularly important if you have a
       
   141 query that is executed in a loop, and could therefore end up doing many database
       
   142 queries, when only one was needed. So:
       
   143 
       
   144 Use ``QuerySet.select_related()``
       
   145 ---------------------------------
       
   146 
       
   147 Understand :ref:`QuerySet.select_related() <select-related>` thoroughly, and use it:
       
   148 
       
   149 * in view code,
       
   150 
       
   151 * and in :doc:`managers and default managers </topics/db/managers>` where
       
   152   appropriate. Be aware when your manager is and is not used; sometimes this is
       
   153   tricky so don't make assumptions.
       
   154 
       
   155 Don't retrieve things you don't need
       
   156 ====================================
       
   157 
       
   158 Use ``QuerySet.values()`` and ``values_list()``
       
   159 -----------------------------------------------
       
   160 
       
   161 When you just want a ``dict`` or ``list`` of values, and don't need ORM model
       
   162 objects, make appropriate usage of :meth:`~django.db.models.QuerySet.values()`.
       
   163 These can be useful for replacing model objects in template code - as long as
       
   164 the dicts you supply have the same attributes as those used in the template,
       
   165 you are fine.
       
   166 
       
   167 Use ``QuerySet.defer()`` and ``only()``
       
   168 ---------------------------------------
       
   169 
       
   170 Use :meth:`~django.db.models.QuerySet.defer()` and
       
   171 :meth:`~django.db.models.QuerySet.only()` if there are database columns you
       
   172 know that you won't need (or won't need in most cases) to avoid loading
       
   173 them. Note that if you *do* use them, the ORM will have to go and get them in a
       
   174 separate query, making this a pessimization if you use it inappropriately.
       
   175 
       
   176 Use QuerySet.count()
       
   177 --------------------
       
   178 
       
   179 ...if you only want the count, rather than doing ``len(queryset)``.
       
   180 
       
   181 Use QuerySet.exists()
       
   182 ---------------------
       
   183 
       
   184 ...if you only want to find out if at least one result exists, rather than ``if
       
   185 queryset``.
       
   186 
       
   187 But:
       
   188 
       
   189 Don't overuse ``count()`` and ``exists()``
       
   190 ------------------------------------------
       
   191 
       
   192 If you are going to need other data from the QuerySet, just evaluate it.
       
   193 
       
   194 For example, assuming an Email class that has a ``body`` attribute and a
       
   195 many-to-many relation to User, the following template code is optimal:
       
   196 
       
   197 .. code-block:: html+django
       
   198 
       
   199    {% if display_inbox %}
       
   200      {% with user.emails.all as emails %}
       
   201        {% if emails %}
       
   202          <p>You have {{ emails|length }} email(s)</p>
       
   203          {% for email in emails %}
       
   204            <p>{{ email.body }}</p>
       
   205          {% endfor %}
       
   206        {% else %}
       
   207          <p>No messages today.</p>
       
   208        {% endif %}
       
   209      {% endwith %}
       
   210    {% endif %}
       
   211 
       
   212 
       
   213 It is optimal because:
       
   214 
       
   215  1. Since QuerySets are lazy, this does no database if 'display_inbox' is False.
       
   216 
       
   217  #. Use of ``with`` means that we store ``user.emails.all`` in a variable for
       
   218     later use, allowing its cache to be re-used.
       
   219 
       
   220  #. The line ``{% if emails %}`` causes ``QuerySet.__nonzero__()`` to be called,
       
   221     which causes the ``user.emails.all()`` query to be run on the database, and
       
   222     at the least the first line to be turned into an ORM object. If there aren't
       
   223     any results, it will return False, otherwise True.
       
   224 
       
   225  #. The use of ``{{ emails|length }}`` calls ``QuerySet.__len__()``, filling
       
   226     out the rest of the cache without doing another query.
       
   227 
       
   228  #. The ``for`` loop iterates over the already filled cache.
       
   229 
       
   230 In total, this code does either one or zero database queries. The only
       
   231 deliberate optimization performed is the use of the ``with`` tag. Using
       
   232 ``QuerySet.exists()`` or ``QuerySet.count()`` at any point would cause
       
   233 additional queries.
       
   234 
       
   235 Use ``QuerySet.update()`` and ``delete()``
       
   236 ------------------------------------------
       
   237 
       
   238 Rather than retrieve a load of objects, set some values, and save them
       
   239 individual, use a bulk SQL UPDATE statement, via :ref:`QuerySet.update()
       
   240 <topics-db-queries-update>`. Similarly, do :ref:`bulk deletes
       
   241 <topics-db-queries-delete>` where possible.
       
   242 
       
   243 Note, however, that these bulk update methods cannot call the ``save()`` or
       
   244 ``delete()`` methods of individual instances, which means that any custom
       
   245 behaviour you have added for these methods will not be executed, including
       
   246 anything driven from the normal database object :doc:`signals </ref/signals>`.
       
   247 
       
   248 Use foreign key values directly
       
   249 -------------------------------
       
   250 
       
   251 If you only need a foreign key value, use the foreign key value that is already on
       
   252 the object you've got, rather than getting the whole related object and taking
       
   253 its primary key. i.e. do::
       
   254 
       
   255    entry.blog_id
       
   256 
       
   257 instead of::
       
   258 
       
   259    entry.blog.id
       
   260