app/django/utils/stopwords.py
author Sverre Rabbelier <srabbelier@gmail.com>
Sun, 01 Feb 2009 22:48:48 +0000
changeset 1166 558bd62ee9d4
parent 54 03e267d67478
permissions -rw-r--r--
Fix get args construction when there are multiple lists on the page It is now possible to go back and forward through the liast, and specify the limit (both offset and limit can be done per list). The JS driving the list boxes is buggy, if visiting an url like: http://localhost:8080/notification/list?limit_0=10 And then change the limit in the second checkbox, it directs to: http://localhost:8080/notification/list?limit_1=25 Whereas it should redirect to: http://localhost:8080/notification/list?limit_0=10&limit_1=25 The logic _does_ work properly when the limit of the changed list is already present in the url. Patch by: Sverre Rabbelier

# Performance note: I benchmarked this code using a set instead of
# a list for the stopwords and was surprised to find that the list
# performed /better/ than the set - maybe because it's only a small
# list.

stopwords = '''
i
a
an
are
as
at
be
by
for
from
how
in
is
it
of
on
or
that
the
this
to
was
what
when
where
'''.split()

def strip_stopwords(sentence):
    "Removes stopwords - also normalizes whitespace"
    words = sentence.split()
    sentence = []
    for word in words:
        if word.lower() not in stopwords:
            sentence.append(word)
    return u' '.join(sentence)