app/django/utils/stopwords.py
author Pawel Solyga <Pawel.Solyga@gmail.com>
Sun, 19 Oct 2008 21:12:08 +0000
changeset 394 4c60652a3947
parent 54 03e267d67478
permissions -rw-r--r--
Add BaseForm class to soc.views.helper.forms module (work in progress). This changes the way as_table function displays the form (for more information have a look into doc string). BaseForm is going to be used for all forms in Melange in future. Right now it's still missing custom form errors labels and "required" text in 3rd column, but that's added as TODO and I'm working on it. Patch by: Pawel Solyga Review by: to-be-reviewed

# Performance note: I benchmarked this code using a set instead of
# a list for the stopwords and was surprised to find that the list
# performed /better/ than the set - maybe because it's only a small
# list.

stopwords = '''
i
a
an
are
as
at
be
by
for
from
how
in
is
it
of
on
or
that
the
this
to
was
what
when
where
'''.split()

def strip_stopwords(sentence):
    "Removes stopwords - also normalizes whitespace"
    words = sentence.split()
    sentence = []
    for word in words:
        if word.lower() not in stopwords:
            sentence.append(word)
    return u' '.join(sentence)