============
File Uploads
============

.. currentmodule:: django.core.files

.. versionadded:: 1.0

When Django handles a file upload, the file data ends up placed in
:attr:`request.FILES <django.http.HttpRequest.FILES>` (for more on the
``request`` object see the documentation for :doc:`request and response objects
</ref/request-response>`). This document explains how files are stored on disk
and in memory, and how to customize the default behavior.

Basic file uploads
==================

Consider a simple form containing a :class:`~django.forms.FileField`::

    from django import forms

    class UploadFileForm(forms.Form):
        title = forms.CharField(max_length=50)
        file = forms.FileField()

A view handling this form will receive the file data in
:attr:`request.FILES <django.http.HttpRequest.FILES>`, which is a dictionary
containing a key for each :class:`~django.forms.FileField` (or
:class:`~django.forms.ImageField`, or other :class:`~django.forms.FileField`
subclass) in the form. So the data from the above form would
be accessible as ``request.FILES['file']``.

Note that :attr:`request.FILES <django.http.HttpRequest.FILES>` will only
contain data if the request method was ``POST`` and the ``<form>`` that posted
the request has the attribute ``enctype="multipart/form-data"``. Otherwise,
``request.FILES`` will be empty.

Most of the time, you'll simply pass the file data from ``request`` into the
form as described in :ref:`binding-uploaded-files`. This would look
something like::

    from django.http import HttpResponseRedirect
    from django.shortcuts import render_to_response

    # Imaginary function to handle an uploaded file.
    from somewhere import handle_uploaded_file

    def upload_file(request):
        if request.method == 'POST':
            form = UploadFileForm(request.POST, request.FILES)
            if form.is_valid():
                handle_uploaded_file(request.FILES['file'])
                return HttpResponseRedirect('/success/url/')
        else:
            form = UploadFileForm()
        return render_to_response('upload.html', {'form': form})

Notice that we have to pass :attr:`request.FILES <django.http.HttpRequest.FILES>`
into the form's constructor; this is how file data gets bound into a form.

Handling uploaded files
-----------------------

The final piece of the puzzle is handling the actual file data from
:attr:`request.FILES <django.http.HttpRequest.FILES>`. Each entry in this
dictionary is an ``UploadedFile`` object -- a simple wrapper around an uploaded
file. You'll usually use one of these methods and attributes to access the
uploaded content:

``UploadedFile.read()``
    Read the entire uploaded data from the file. Be careful with this
    method: reading a huge uploaded file into memory can overwhelm your
    system. You'll probably want to use ``chunks()`` instead; see below.

``UploadedFile.multiple_chunks()``
    Returns ``True`` if the uploaded file is big enough to require
    reading in multiple chunks. By default this will be any file
    larger than 2.5 megabytes, but that's configurable; see below.

``UploadedFile.chunks()``
    A generator returning chunks of the file. If ``multiple_chunks()`` is
    ``True``, you should use this method in a loop instead of ``read()``.

    In practice, it's often easiest simply to use ``chunks()`` all the time;
    see the example below.

``UploadedFile.name``
    The name of the uploaded file (e.g. ``my_file.txt``).

``UploadedFile.size``
    The size, in bytes, of the uploaded file.

There are a few other methods and attributes available on ``UploadedFile``
objects; see `UploadedFile objects`_ for a complete reference.

Putting it all together, here's a common way you might handle an uploaded file::

    def handle_uploaded_file(f):
        # The with statement closes the file even if a write fails.
        with open('some/file/name.txt', 'wb+') as destination:
            for chunk in f.chunks():
                destination.write(chunk)

Looping over ``UploadedFile.chunks()`` instead of using ``read()`` ensures that
large files don't overwhelm your system's memory.

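As a minimal sketch of the same pattern that can be run outside a Django
project, the example below writes any object exposing a ``chunks()`` method to
disk. ``FakeUpload``, ``save_upload`` and the destination path are hypothetical
names introduced for illustration; in a real view the argument would be an
entry from ``request.FILES``:

.. code-block:: python

    import os
    import tempfile


    class FakeUpload(object):
        """Hypothetical stand-in for an uploaded file; only chunks() is modelled."""

        def __init__(self, data, chunk_size=4):
            self.data = data
            self.chunk_size = chunk_size

        def chunks(self):
            # Yield the data in fixed-size pieces, as an uploaded file would.
            for start in range(0, len(self.data), self.chunk_size):
                yield self.data[start:start + self.chunk_size]


    def save_upload(f, path):
        """Write f to path chunk by chunk; return the number of bytes written."""
        written = 0
        with open(path, 'wb') as destination:
            for chunk in f.chunks():
                destination.write(chunk)
                written += len(chunk)
        return written


    target = os.path.join(tempfile.mkdtemp(), 'name.txt')
    size = save_upload(FakeUpload(b'hello, upload'), target)

Only one chunk is held in memory at a time, so peak memory use depends on the
chunk size rather than on the size of the file.
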
Where uploaded data is stored
-----------------------------

Before you save uploaded files, the data needs to be stored somewhere.

By default, if an uploaded file is smaller than 2.5 megabytes, Django will hold
the entire contents of the upload in memory. This means that saving the file
involves only a read from memory and a write to disk and thus is very fast.

However, if an uploaded file is too large, Django will write the uploaded file
to a temporary file stored in your system's temporary directory. On a Unix-like
platform this means you can expect Django to generate a file called something
like ``/tmp/tmpzfp6I6.upload``. If an upload is large enough, you can watch this
file grow in size as Django streams the data onto disk.

These specifics -- 2.5 megabytes; ``/tmp``; etc. -- are simply "reasonable
defaults". Read on for details on how you can customize or completely replace
upload behavior.

Changing upload handler behavior
--------------------------------

Four settings control Django's file upload behavior:

:setting:`FILE_UPLOAD_MAX_MEMORY_SIZE`
    The maximum size, in bytes, for files that will be uploaded into memory.
    Files larger than :setting:`FILE_UPLOAD_MAX_MEMORY_SIZE` will be
    streamed to disk.

    Defaults to 2.5 megabytes.

:setting:`FILE_UPLOAD_TEMP_DIR`
    The directory where uploaded files larger than
    :setting:`FILE_UPLOAD_MAX_MEMORY_SIZE` will be stored.

    Defaults to your system's standard temporary directory (e.g. ``/tmp`` on
    most Unix-like systems).

:setting:`FILE_UPLOAD_PERMISSIONS`
    The numeric mode (e.g. ``0644``) to set newly uploaded files to. For
    more information about what these modes mean, see the `documentation for
    os.chmod`_.

    If this isn't given or is ``None``, you'll get operating-system
    dependent behavior. On most platforms, temporary files will have a mode
    of ``0600``, and files saved from memory will be saved using the
    system's standard umask.

    .. warning::

        If you're not familiar with file modes, please note that the leading
        ``0`` is very important: it indicates an octal number, which is the
        way that modes must be specified. If you try to use ``644``, you'll
        get totally incorrect behavior.

        **Always prefix the mode with a 0.**

:setting:`FILE_UPLOAD_HANDLERS`
    The actual handlers for uploaded files. Changing this setting allows
    complete customization -- even replacement -- of Django's upload
    process. See `upload handlers`_, below, for details.

    Defaults to::

        ("django.core.files.uploadhandler.MemoryFileUploadHandler",
         "django.core.files.uploadhandler.TemporaryFileUploadHandler",)

    Which means "try to upload to memory first, then fall back to temporary
    files."

.. _documentation for os.chmod: http://docs.python.org/library/os.html#os.chmod

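As an illustration, a project that wants to keep uploads of up to 5 megabytes
in memory and spool larger ones to a dedicated directory might use settings
along these lines. The values and the directory path are assumptions chosen for
the example, not recommendations. The mode is written with the ``0o`` prefix,
an alternative spelling of an octal literal that Python 2.6 and later accept
and Python 3 requires:

.. code-block:: python

    # settings.py (sketch; all values are illustrative assumptions)

    # Keep uploads of up to 5 MB in memory (the default is 2.5 MB = 2621440 bytes).
    FILE_UPLOAD_MAX_MEMORY_SIZE = 5 * 2 ** 20

    # Hypothetical directory for uploads that are too large for memory.
    FILE_UPLOAD_TEMP_DIR = '/var/tmp/django_uploads'

    # Octal mode: owner may read and write, group and others may only read.
    FILE_UPLOAD_PERMISSIONS = 0o644

    # Why the octal prefix matters: decimal 644 is a different number entirely.
    assert FILE_UPLOAD_PERMISSIONS == int('644', 8) == 420
    assert 644 == int('1204', 8)  # decimal 644 would mean mode 1204, not 644

The two assertions spell out the warning above: the octal literal evaluates to
decimal 420, whereas a plain ``644`` corresponds to the unrelated mode ``1204``.
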
``UploadedFile`` objects
========================

.. class:: UploadedFile

In addition to those inherited from :class:`File`, all ``UploadedFile`` objects
define the following methods/attributes:

``UploadedFile.content_type``
    The content-type header uploaded with the file (e.g. ``text/plain`` or
    ``application/pdf``). Like any data supplied by the user, you shouldn't
    trust that the uploaded file is actually this type. You'll still need to
    validate that the file contains the content that the content-type header
    claims -- "trust but verify."

``UploadedFile.charset``
    For ``text/*`` content-types, the character set (e.g. ``utf8``) supplied
    by the browser. Again, "trust but verify" is the best policy here.

``UploadedFile.temporary_file_path()``
    Only files uploaded onto disk will have this method; it returns the full
    path to the temporary uploaded file.

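The "trust but verify" advice for ``content_type`` can be sketched by comparing
the first bytes of the upload against the signature that the claimed format
requires. The following is a rough sketch for PNG images only;
``looks_like_png`` is a hypothetical helper, and ``io.BytesIO`` stands in for
the uploaded file. A matching signature shows only that the file starts like a
PNG, not that it is a valid or safe image:

.. code-block:: python

    import io

    # Every PNG file starts with this fixed 8-byte signature.
    PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'


    def looks_like_png(f):
        """Return True if the file-like object f starts with the PNG signature."""
        f.seek(0)
        header = f.read(len(PNG_SIGNATURE))
        f.seek(0)  # rewind so later code can read the file from the start
        return header == PNG_SIGNATURE


    # io.BytesIO objects stand in for uploaded files here.
    real = io.BytesIO(PNG_SIGNATURE + b'...rest of the image...')
    fake = io.BytesIO(b'<html>not an image</html>')

    claimed_png_ok = looks_like_png(real)
    claimed_png_bad = looks_like_png(fake)
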
.. note::

    Like regular Python files, you can read the file line-by-line simply by
    iterating over the uploaded file:

    .. code-block:: python

        for line in uploadedfile:
            do_something_with(line)

    However, *unlike* standard Python files, :class:`UploadedFile` only
    understands ``\n`` (also known as "Unix-style") line endings. If you know
    that you need to handle uploaded files with different line endings, you'll
    need to do so in your view.

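For example, a view that expects files with mixed line endings could split the
raw bytes itself. This is a sketch for files small enough to read into memory;
``iter_lines`` is a hypothetical helper, and ``io.BytesIO`` stands in for the
uploaded file:

.. code-block:: python

    import io


    def iter_lines(f):
        """Yield the lines of f without their terminators, whatever the style."""
        data = f.read()
        # bytes.splitlines() treats \n, \r\n and \r all as line boundaries.
        for line in data.splitlines():
            yield line


    uploaded = io.BytesIO(b'unix\nwindows\r\nold mac\rlast')
    lines = list(iter_lines(uploaded))

For uploads too large to hold in memory, the same idea would have to be applied
chunk by chunk, taking care of terminators that straddle a chunk boundary.
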
Upload Handlers
===============

When a user uploads a file, Django passes off the file data to an *upload
handler* -- a small class that handles file data as it gets uploaded. Upload
handlers are initially defined in the ``FILE_UPLOAD_HANDLERS`` setting, which
defaults to::

    ("django.core.files.uploadhandler.MemoryFileUploadHandler",
     "django.core.files.uploadhandler.TemporaryFileUploadHandler",)

Together the ``MemoryFileUploadHandler`` and ``TemporaryFileUploadHandler``
provide Django's default file upload behavior of reading small files into memory
and large ones onto disk.

You can write custom handlers that customize how Django handles files. You
could, for example, use custom handlers to enforce user-level quotas, compress
data on the fly, render progress bars, and even send data to another storage
location directly without storing it locally.

Modifying upload handlers on the fly
------------------------------------

Sometimes particular views require different upload behavior. In these cases,
you can override upload handlers on a per-request basis by modifying
``request.upload_handlers``. By default, this list will contain the upload
handlers given by ``FILE_UPLOAD_HANDLERS``, but you can modify the list as you
would any other list.

For instance, suppose you've written a ``ProgressBarUploadHandler`` that
provides feedback on upload progress to some sort of AJAX widget. You'd add this
handler to your upload handlers like this::

    request.upload_handlers.insert(0, ProgressBarUploadHandler())

You'd probably want to use ``list.insert()`` in this case (instead of
``append()``) because a progress bar handler would need to run *before* any
other handlers. Remember, the upload handlers are processed in order.

If you want to replace the upload handlers completely, you can just assign a new
list::

    request.upload_handlers = [ProgressBarUploadHandler()]

.. note::

    You can only modify upload handlers *before* accessing
    ``request.POST`` or ``request.FILES`` -- it doesn't make sense to
    change upload handlers after upload handling has already
    started. If you try to modify ``request.upload_handlers`` after
    reading from ``request.POST`` or ``request.FILES``, Django will
    raise an error.

    Thus, you should always modify upload handlers as early in your view as
    possible.

    Also, ``request.POST`` is accessed by
    :class:`~django.middleware.csrf.CsrfViewMiddleware`, which is enabled by
    default. This means you will probably need to use
    :func:`~django.views.decorators.csrf.csrf_exempt` on your view to allow you
    to change the upload handlers. Assuming you do need CSRF protection, you
    will then need to use :func:`~django.views.decorators.csrf.csrf_protect` on
    the function that actually processes the request. Note that this means that
    the handlers may start receiving the file upload before the CSRF checks have
    been done. Example code:

    .. code-block:: python

        from django.views.decorators.csrf import csrf_exempt, csrf_protect

        @csrf_exempt
        def upload_file_view(request):
            request.upload_handlers.insert(0, ProgressBarUploadHandler())
            return _upload_file_view(request)

        @csrf_protect
        def _upload_file_view(request):
            ... # Process request


Writing custom upload handlers
------------------------------

All file upload handlers should be subclasses of
``django.core.files.uploadhandler.FileUploadHandler``. You can define upload
handlers wherever you wish.

Required methods
~~~~~~~~~~~~~~~~

Custom file upload handlers **must** define the following methods:

``FileUploadHandler.receive_data_chunk(self, raw_data, start)``
    Receives a "chunk" of data from the file upload.

    ``raw_data`` is a byte string containing the uploaded data.

    ``start`` is the position in the file where this ``raw_data`` chunk
    begins.

    The data you return will get fed into the subsequent upload handlers'
    ``receive_data_chunk`` methods. In this way, one handler can be a
    "filter" for other handlers.

    Return ``None`` from ``receive_data_chunk`` to short-circuit remaining
    upload handlers from getting this chunk. This is useful if you're
    storing the uploaded data yourself and don't want future handlers to
    store a copy of the data.

    If you raise a ``StopUpload`` or a ``SkipFile`` exception, the upload
    will abort or the file will be completely skipped, respectively.

``FileUploadHandler.file_complete(self, file_size)``
    Called when a file has finished uploading.

    The handler should return an ``UploadedFile`` object that will be stored
    in ``request.FILES``. Handlers may also return ``None`` to indicate that
    the ``UploadedFile`` object should come from subsequent upload handlers.

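The chunk-passing rules for ``receive_data_chunk`` can be modelled in plain
Python. The sketch below is not Django's implementation and does not import
Django: ``feed_chunk``, ``UppercaseFilter`` and ``Collector`` are hypothetical
names that only mimic how the return value of ``receive_data_chunk`` travels
from one handler to the next:

.. code-block:: python

    class UppercaseFilter(object):
        """Acts as a filter: passes a modified chunk on to later handlers."""

        def receive_data_chunk(self, raw_data, start):
            return raw_data.upper()


    class Collector(object):
        """Stores every chunk itself and returns None to stop the chain."""

        def __init__(self):
            self.chunks = []

        def receive_data_chunk(self, raw_data, start):
            self.chunks.append(raw_data)
            return None


    def feed_chunk(handlers, raw_data, start):
        """Offer one chunk to each handler in order, following the rules above."""
        for handler in handlers:
            raw_data = handler.receive_data_chunk(raw_data, start)
            if raw_data is None:
                # This handler consumed the chunk; later handlers never see it.
                break


    first, second = Collector(), Collector()
    handlers = [UppercaseFilter(), first, second]

    start = 0
    for chunk in (b'abc', b'def'):
        feed_chunk(handlers, chunk, start)
        start += len(chunk)

After the loop, ``first`` holds the upper-cased chunks produced by the filter,
while ``second`` holds nothing because ``first`` returned ``None`` each time.
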
Optional methods
~~~~~~~~~~~~~~~~

Custom upload handlers may also define any of the following optional methods or
attributes:

``FileUploadHandler.chunk_size``
    Size, in bytes, of the "chunks" Django should store into memory and feed
    into the handler. That is, this attribute controls the size of chunks
    fed into ``FileUploadHandler.receive_data_chunk``.

    For maximum performance the chunk sizes should be divisible by ``4`` and
    should not exceed 2 GB (2\ :sup:`31` bytes) in size. When there are
    multiple chunk sizes provided by multiple handlers, Django will use the
    smallest chunk size defined by any handler.

    The default is 64*2\ :sup:`10` bytes, or 64 KB.

``FileUploadHandler.new_file(self, field_name, file_name, content_type, content_length, charset)``
    Callback signaling that a new file upload is starting. This is called
    before any data has been fed to any upload handlers.

    ``field_name`` is a string name of the file ``<input>`` field.

    ``file_name`` is the unicode filename that was provided by the browser.

    ``content_type`` is the MIME type provided by the browser, e.g.
    ``'image/jpeg'``.

    ``content_length`` is the length of the file given by the browser.
    Sometimes this won't be provided and will be ``None``.

    ``charset`` is the character set (e.g. ``utf8``) given by the browser.
    Like ``content_length``, this sometimes won't be provided.

    This method may raise a ``StopFutureHandlers`` exception to prevent
    future handlers from handling this file.

``FileUploadHandler.upload_complete(self)``
    Callback signaling that the entire upload (all files) has completed.

``FileUploadHandler.handle_raw_input(self, input_data, META, content_length, boundary, encoding)``
    Allows the handler to completely override the parsing of the raw
    HTTP input.

    ``input_data`` is a file-like object that supports ``read()``-ing.

    ``META`` is the same object as ``request.META``.

    ``content_length`` is the length of the data in ``input_data``. Don't
    read more than ``content_length`` bytes from ``input_data``.

    ``boundary`` is the MIME boundary for this request.

    ``encoding`` is the encoding of the request.

    Return ``None`` if you want upload handling to continue, or a tuple of
    ``(POST, FILES)`` if you want to return the new data structures suitable
    for the request directly.
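
The chunk-size rules described under ``FileUploadHandler.chunk_size`` can be
checked with a little arithmetic. In this sketch ``DefaultChunks`` and
``SmallChunks`` are hypothetical stand-ins rather than real handlers; they carry
only the attribute in question, and the 16 KB value is an assumption chosen for
the example:

.. code-block:: python

    class DefaultChunks(object):
        chunk_size = 64 * 2 ** 10  # the documented default: 64 KB


    class SmallChunks(object):
        chunk_size = 16 * 2 ** 10  # an assumed custom value: 16 KB


    handlers = [DefaultChunks(), SmallChunks()]

    # With several handlers installed, the smallest chunk size wins.
    effective = min(handler.chunk_size for handler in handlers)

    # Both recommendations hold: divisible by 4 and no larger than 2**31 bytes.
    meets_recommendations = effective % 4 == 0 and effective <= 2 ** 31
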