=====================
The sitemap framework
=====================

**New in Django development version**.

Django comes with a high-level sitemap-generating framework that makes
creating sitemap_ XML files easy.

.. _sitemap: http://www.sitemaps.org/

Overview
========

A sitemap is an XML file on your Web site that tells search-engine indexers how
frequently your pages change and how "important" certain pages are in relation
to other pages on your site. This information helps search engines index your
site.

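To make the file format concrete, here is a minimal sketch that builds a
one-entry sitemap with Python's standard ``xml.etree.ElementTree`` module. It
does not use Django at all. The helper name ``build_urlset`` and the example
URL are illustrative assumptions; the namespace is the one published by
sitemaps.org.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(entries):
    """Build a <urlset> element from dicts with 'loc' and optional fields.

    Hypothetical helper for illustration; not part of Django.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit only the fields that the entry actually provides.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return urlset


example = build_urlset([
    {"loc": "http://example.com/blog/1/", "changefreq": "never", "priority": 0.5},
])
xml_text = ET.tostring(example, encoding="unicode")
```

Each ``<url>`` element carries the page location plus the optional hints
(``<lastmod>``, ``<changefreq>``, ``<priority>``) that the rest of this
document describes.
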

The Django sitemap framework automates the creation of this XML file by letting
you express this information in Python code.

It works much like Django's `syndication framework`_. To create a sitemap, just
write a ``Sitemap`` class and point to it in your URLconf_.

.. _syndication framework: ../syndication/
.. _URLconf: ../url_dispatch/

Installation
============

To install the sitemap app, follow these steps:

1. Add ``'django.contrib.sitemaps'`` to your INSTALLED_APPS_ setting.
2. Make sure ``'django.template.loaders.app_directories.load_template_source'``
   is in your TEMPLATE_LOADERS_ setting. It's in there by default, so
   you'll only need to change this if you've changed that setting.
3. Make sure you've installed the `sites framework`_.

(Note: The sitemap application doesn't install any database tables. The only
reason it needs to go into ``INSTALLED_APPS`` is so that the
``load_template_source`` template loader can find the default templates.)

.. _INSTALLED_APPS: ../settings/#installed-apps
.. _TEMPLATE_LOADERS: ../settings/#template-loaders
.. _sites framework: ../sites/

Initialization
==============

To activate sitemap generation on your Django site, add this line to your
URLconf_::

    (r'^sitemap\.xml$', 'django.contrib.sitemaps.views.sitemap', {'sitemaps': sitemaps})

This tells Django to build a sitemap when a client accesses ``/sitemap.xml``.

The name of the sitemap file is not important, but the location is. Search
engines will only index links in your sitemap for the current URL level and
below. For instance, if ``sitemap.xml`` lives in your root directory, it may
reference any URL in your site. However, if your sitemap lives at
``/content/sitemap.xml``, it may only reference URLs that begin with
``/content/``.

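The location rule above can be sketched as a small path check. This is only an
illustration of the rule as stated, not something the framework provides; the
function name ``in_sitemap_scope`` is a hypothetical helper.

```python
import posixpath


def in_sitemap_scope(sitemap_path, url_path):
    """Return True if url_path is at or below the sitemap's directory.

    Hypothetical helper for illustration; not part of Django.
    """
    base = posixpath.dirname(sitemap_path)
    # Compare whole path segments, so "/content" does not match "/contention/".
    if not base.endswith("/"):
        base += "/"
    return url_path.startswith(base)
```

For example, ``in_sitemap_scope('/content/sitemap.xml', '/blog/1/')`` is
``False``, while any path is in scope for a sitemap served from the root.
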

The sitemap view takes an extra, required argument: ``{'sitemaps': sitemaps}``.
``sitemaps`` should be a dictionary that maps a short section label (e.g.,
``blog`` or ``news``) to its ``Sitemap`` class (e.g., ``BlogSitemap`` or
``NewsSitemap``). It may also map to an *instance* of a ``Sitemap`` class
(e.g., ``BlogSitemap(some_var)``).

.. _URLconf: ../url_dispatch/

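As a rough sketch of what such a dictionary looks like, the example below
mixes a class and an instance and shows one way a consumer could treat both
uniformly. The ``Sitemap`` base class here is a stand-in so the snippet runs
without Django, and ``resolve_sections`` is a hypothetical helper, not the
framework's own code.

```python
class Sitemap:
    """Stand-in for django.contrib.sitemaps.Sitemap (illustration only)."""

    def items(self):
        return []


class BlogSitemap(Sitemap):
    def __init__(self, author=None):
        self.author = author

    def items(self):
        return ["/blog/1/", "/blog/2/"]


class NewsSitemap(Sitemap):
    def items(self):
        return ["/news/today/"]


sitemaps = {
    "blog": BlogSitemap("alice"),  # an instance, built with an argument
    "news": NewsSitemap,           # a class, instantiated on demand
}


def resolve_sections(sitemaps):
    """Return {label: instance}, instantiating any value given as a class."""
    resolved = {}
    for label, site in sitemaps.items():
        resolved[label] = site() if isinstance(site, type) else site
    return resolved


resolved = resolve_sections(sitemaps)
```

Passing an instance is useful when the ``Sitemap`` needs constructor
arguments; passing the class is enough otherwise.
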

Sitemap classes
===============

A ``Sitemap`` class is a simple Python class that represents a "section" of
entries in your sitemap. For example, one ``Sitemap`` class could represent all
the entries of your weblog, while another could represent all of the events in
your events calendar.

In the simplest case, all these sections get lumped together into one
``sitemap.xml``, but it's also possible to use the framework to generate a
sitemap index that references individual sitemap files, one per section. (See
`Creating a sitemap index`_ below.)

``Sitemap`` classes must subclass ``django.contrib.sitemaps.Sitemap``. They can
live anywhere in your codebase.

A simple example
================

Let's assume you have a blog system, with an ``Entry`` model, and you want your
sitemap to include all the links to your individual blog entries. Here's how
your sitemap class might look::

    from django.contrib.sitemaps import Sitemap
    from mysite.blog.models import Entry

    class BlogSitemap(Sitemap):
        changefreq = "never"
        priority = 0.5

        def items(self):
            return Entry.objects.filter(is_draft=False)

        def lastmod(self, obj):
            return obj.pub_date

Note:

* ``changefreq`` and ``priority`` are class attributes corresponding to
  ``<changefreq>`` and ``<priority>`` elements, respectively. They can also
  be defined as methods, as ``lastmod`` is in the example.
* ``items()`` is simply a method that returns a list of objects. The objects
  returned will get passed to any callable methods corresponding to a
  sitemap property (``location``, ``lastmod``, ``changefreq``, and
  ``priority``).
* ``lastmod`` should return a Python ``datetime`` object.
* There is no ``location`` method in this example, but you can provide it
  in order to specify the URL for your object. By default, ``location()``
  calls ``get_absolute_url()`` on each object and returns the result.

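The "attribute or method" behaviour described in the notes above can be
sketched in plain Python. This is an illustration of the idea rather than the
framework's actual implementation; ``get_sitemap_value`` and
``ExampleSitemap`` are hypothetical names.

```python
import datetime


def get_sitemap_value(sitemap, name, obj, default=None):
    """Look up `name` on the sitemap; call it with obj if it is callable."""
    attr = getattr(sitemap, name, default)
    if callable(attr):
        return attr(obj)
    return attr


class ExampleSitemap:
    changefreq = "never"          # plain attribute: same value for every object

    def lastmod(self, obj):       # method: computed per object
        return obj["pub_date"]


entry = {"pub_date": datetime.datetime(2007, 1, 15, 12, 0)}
site = ExampleSitemap()
```

Here ``get_sitemap_value(site, 'changefreq', entry)`` returns the shared class
attribute, while ``get_sitemap_value(site, 'lastmod', entry)`` calls the method
with the object.
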

Sitemap class reference
=======================

A ``Sitemap`` class can define the following methods/attributes:

``items``
---------

**Required.** A method that returns a list of objects. The framework doesn't
care what *type* of objects they are; all that matters is that these objects
get passed to the ``location()``, ``lastmod()``, ``changefreq()`` and
``priority()`` methods.

``location``
------------

**Optional.** Either a method or attribute.

If it's a method, it should return the absolute URL for a given object as
returned by ``items()``.

If it's an attribute, its value should be a string representing an absolute URL
to use for *every* object returned by ``items()``.

In both cases, "absolute URL" means a URL that doesn't include the protocol or
domain. Examples:

* Good: ``'/foo/bar/'``
* Bad: ``'example.com/foo/bar/'``
* Bad: ``'http://example.com/foo/bar/'``

If ``location`` isn't provided, the framework will call the
``get_absolute_url()`` method on each object as returned by ``items()``.

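A quick way to sanity-check a ``location`` value against the good and bad
examples above is sketched below. The function ``is_valid_location`` is a
hypothetical helper built on the standard library, not something the
framework provides.

```python
from urllib.parse import urlsplit


def is_valid_location(location):
    """Return True for a path such as '/foo/bar/' with no protocol or domain."""
    parts = urlsplit(location)
    # Must start at the site root and carry neither a scheme nor a host.
    return location.startswith("/") and not parts.scheme and not parts.netloc
```

The ``netloc`` check also rejects protocol-relative values such as
``'//example.com/foo/'``, which start with a slash but still name a domain.
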

``lastmod``
-----------

**Optional.** Either a method or attribute.

If it's a method, it should take one argument -- an object as returned by
``items()`` -- and return that object's last-modified date/time, as a Python
``datetime.datetime`` object.

If it's an attribute, its value should be a Python ``datetime.datetime`` object
representing the last-modified date/time for *every* object returned by
``items()``.

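The sitemaps.org protocol expects ``<lastmod>`` in W3C Datetime format, and a
date-only ``YYYY-MM-DD`` value is permitted. As a small sketch of how a
``datetime.datetime`` could be turned into that form (``format_lastmod`` is a
hypothetical helper, and this is not necessarily how the framework's template
renders the value):

```python
import datetime


def format_lastmod(value):
    """Format a date or datetime as a date-only W3C string (YYYY-MM-DD)."""
    return value.strftime("%Y-%m-%d")


stamp = format_lastmod(datetime.datetime(2007, 1, 15, 12, 30))
```
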

``changefreq``
--------------

**Optional.** Either a method or attribute.

If it's a method, it should take one argument -- an object as returned by
``items()`` -- and return that object's change frequency, as a Python string.

If it's an attribute, its value should be a string representing the change
frequency of *every* object returned by ``items()``.

Possible values for ``changefreq``, whether you use a method or attribute, are:

* ``'always'``
* ``'hourly'``
* ``'daily'``
* ``'weekly'``
* ``'monthly'``
* ``'yearly'``
* ``'never'``

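If you compute ``changefreq`` dynamically, it can help to guard against
typos. The sketch below checks a value against the list above;
``VALID_CHANGEFREQS`` and ``check_changefreq`` are hypothetical names, and the
document does not say whether the framework performs such validation itself.

```python
# The seven values listed above.
VALID_CHANGEFREQS = frozenset([
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
])


def check_changefreq(value):
    """Return value unchanged if it is a recognised change frequency."""
    if value not in VALID_CHANGEFREQS:
        raise ValueError("invalid changefreq: %r" % (value,))
    return value
```
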

``priority``
------------

**Optional.** Either a method or attribute.

If it's a method, it should take one argument -- an object as returned by
``items()`` -- and return that object's priority, as either a string or float.

If it's an attribute, its value should be either a string or float representing
the priority of *every* object returned by ``items()``.

Example values for ``priority``: ``0.4``, ``1.0``. The default priority of a
page is ``0.5``. See the `sitemaps.org documentation`_ for more.

.. _sitemaps.org documentation: http://www.sitemaps.org/protocol.html#prioritydef

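Because ``priority`` may be given as a string or a float, a small
normalization step can be handy in your own code. The sketch below assumes the
``0.0`` to ``1.0`` range defined by the sitemaps.org protocol;
``normalize_priority`` is a hypothetical helper, not part of the framework.

```python
def normalize_priority(value):
    """Convert a string or float priority to a float within [0.0, 1.0]."""
    priority = float(value)
    if not 0.0 <= priority <= 1.0:
        raise ValueError("priority must be between 0.0 and 1.0: %r" % (value,))
    return priority
```
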

Shortcuts
=========

The sitemap framework provides a couple of convenience classes for common
cases:

``FlatPageSitemap``
-------------------

The ``django.contrib.sitemaps.FlatPageSitemap`` class looks at all flatpages_
defined for the current ``SITE_ID`` (see the `sites documentation`_) and
creates a sitemap entry for each one. These entries include only the
``location`` attribute -- not ``lastmod``, ``changefreq`` or ``priority``.

.. _flatpages: ../flatpages/
.. _sites documentation: ../sites/

``GenericSitemap``
------------------

The ``GenericSitemap`` class works with any `generic views`_ you already have.
To use it, create an instance, passing in the same ``info_dict`` you pass to
the generic views. The only requirement is that the dictionary have a
``queryset`` entry. It may also have a ``date_field`` entry that specifies a
date field for objects retrieved from the ``queryset``. This will be used for
the ``lastmod`` attribute in the generated sitemap. You may also pass
``priority`` and ``changefreq`` keyword arguments to the ``GenericSitemap``
constructor to specify these attributes for all URLs.

.. _generic views: ../generic_views/

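To illustrate how an ``info_dict`` could drive a sitemap section, here is a
simplified stand-in written without Django. ``GenericSitemapSketch`` and
``FakeEntry`` are hypothetical names, a plain list plays the role of the
queryset, and the real ``GenericSitemap`` may differ in its details.

```python
import datetime


class GenericSitemapSketch:
    """Illustrative stand-in; not Django's GenericSitemap."""

    def __init__(self, info_dict, priority=None, changefreq=None):
        self.queryset = info_dict["queryset"]           # required entry
        self.date_field = info_dict.get("date_field")   # optional entry
        self.priority = priority
        self.changefreq = changefreq

    def items(self):
        return list(self.queryset)

    def lastmod(self, obj):
        # Without a date_field there is nothing to report.
        if self.date_field is None:
            return None
        return getattr(obj, self.date_field)


class FakeEntry:
    """Minimal object with a date attribute, standing in for a model."""

    def __init__(self, pub_date):
        self.pub_date = pub_date


info_dict = {
    "queryset": [FakeEntry(datetime.datetime(2007, 1, 15))],
    "date_field": "pub_date",
}
blog = GenericSitemapSketch(info_dict, priority=0.6)
```

The same ``priority`` and ``changefreq`` apply to every object, while
``lastmod`` is read from each object through the configured ``date_field``.
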

Example
-------

Here's an example of a URLconf_ using both::

    from django.conf.urls.defaults import *
    from django.contrib.sitemaps import FlatPageSitemap, GenericSitemap
    from mysite.blog.models import Entry

    info_dict = {
        'queryset': Entry.objects.all(),
        'date_field': 'pub_date',
    }

    sitemaps = {
        'flatpages': FlatPageSitemap,
        'blog': GenericSitemap(info_dict, priority=0.6),
    }

    urlpatterns = patterns('',
        # some generic view using info_dict
        # ...

        # the sitemap
        (r'^sitemap\.xml$', 'django.contrib.sitemaps.views.sitemap', {'sitemaps': sitemaps})
    )

.. _URLconf: ../url_dispatch/

Creating a sitemap index
========================

The sitemap framework can also create a sitemap index that references
individual sitemap files, one per section defined in your ``sitemaps``
dictionary. The only differences in usage are:

* You use two views in your URLconf: ``django.contrib.sitemaps.views.index``
  and ``django.contrib.sitemaps.views.sitemap``.
* The ``django.contrib.sitemaps.views.sitemap`` view should take a
  ``section`` keyword argument.

Here is what the relevant URLconf lines would look like for the example above::

    (r'^sitemap\.xml$', 'django.contrib.sitemaps.views.index', {'sitemaps': sitemaps}),
    (r'^sitemap-(?P<section>.+)\.xml$', 'django.contrib.sitemaps.views.sitemap', {'sitemaps': sitemaps}),

This will automatically generate a ``sitemap.xml`` file that references
both ``sitemap-flatpages.xml`` and ``sitemap-blog.xml``. The ``Sitemap``
classes and the ``sitemaps`` dict don't change at all.

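To show roughly what such an index contains, the sketch below builds one with
the standard library from a list of section labels. It is independent of
Django; ``build_sitemap_index`` and the ``http://example.com`` base URL are
assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(base_url, sections):
    """Return sitemap index XML with one <sitemap> entry per section label."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for section in sorted(sections):
        sitemap = ET.SubElement(index, "sitemap")
        # Mirrors the sitemap-<section>.xml URL pattern shown above.
        ET.SubElement(sitemap, "loc").text = "%s/sitemap-%s.xml" % (base_url, section)
    return ET.tostring(index, encoding="unicode")


index_xml = build_sitemap_index("http://example.com", ["flatpages", "blog"])
```

Each ``<loc>`` in the index points at a per-section file, which is what the
``section`` keyword argument selects in the second URL pattern.
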

Pinging Google
==============

You may want to "ping" Google when your sitemap changes, to let it know to
reindex your site. The framework provides a function to do just that:
``django.contrib.sitemaps.ping_google()``.

``ping_google()`` takes an optional argument, ``sitemap_url``, which should be
the absolute URL of your site's sitemap (e.g., ``'/sitemap.xml'``). If this
argument isn't provided, ``ping_google()`` will attempt to figure out your
sitemap by performing a reverse lookup in your URLconf.

``ping_google()`` raises the exception
``django.contrib.sitemaps.SitemapNotFound`` if it cannot determine your sitemap
URL.

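Conceptually, a ping is an HTTP request that carries the full sitemap URL as a
query parameter. The sketch below only builds such a request URL and sends
nothing. The endpoint ``http://ping.example.com/ping``, the ``sitemap``
parameter name, and the ``build_ping_url`` helper are all assumptions for
illustration, not the values ``ping_google()`` necessarily uses.

```python
from urllib.parse import urlencode


def build_ping_url(ping_endpoint, domain, sitemap_url):
    """Combine a domain and a relative sitemap path into a ping request URL."""
    # A search engine needs the full URL, so prepend the site's domain.
    full_sitemap_url = "http://%s%s" % (domain, sitemap_url)
    return "%s?%s" % (ping_endpoint, urlencode({"sitemap": full_sitemap_url}))


# Placeholder endpoint and domain, used only for this example.
ping_url = build_ping_url("http://ping.example.com/ping", "example.com", "/sitemap.xml")
```
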

One useful way to call ``ping_google()`` is from a model's ``save()`` method::

    from django.contrib.sitemaps import ping_google
    from django.db import models

    class Entry(models.Model):
        # ...
        def save(self):
            super(Entry, self).save()
            try:
                ping_google()
            except Exception:
                # Catch broadly because we could get a variety
                # of HTTP-related exceptions.
                pass

A more efficient solution, however, would be to call ``ping_google()`` from a
cron script, or some other scheduled task. The function makes an HTTP request
to Google's servers, so you may not want to introduce that network overhead
each time you call ``save()``.