app/django/utils/encoding.py
author Todd Larsen <tlarsen@google.com>
Mon, 27 Oct 2008 19:19:44 +0000
changeset 426 114fe0f840c8
parent 323 ff1a9aa48cfd
permissions -rw-r--r--
Add tooltips style display of help_text, instead of widening the form with another (often ridiculously large) column. Also, version the .css and .js files so that browsers won't display the wrong cached one when we change something. Patch by: Dmitri Gaskin, Todd Larsen Review by: Todd Larsen
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
54
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
     1
import types
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
     2
import urllib
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
     3
import datetime
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
     4
from django.utils.functional import Promise
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
     5
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
     6
class DjangoUnicodeDecodeError(UnicodeDecodeError):
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
     7
    def __init__(self, obj, *args):
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
     8
        self.obj = obj
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
     9
        UnicodeDecodeError.__init__(self, *args)
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    10
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    11
    def __str__(self):
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    12
        original = UnicodeDecodeError.__str__(self)
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    13
        return '%s. You passed in %r (%s)' % (original, self.obj,
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    14
                type(self.obj))
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    15
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    16
class StrAndUnicode(object):
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    17
    """
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    18
    A class whose __str__ returns its __unicode__ as a UTF-8 bytestring.
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    19
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    20
    Useful as a mix-in.
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    21
    """
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    22
    def __str__(self):
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    23
        return self.__unicode__().encode('utf-8')
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    24
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    25
def smart_unicode(s, encoding='utf-8', strings_only=False, errors='strict'):
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    26
    """
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    27
    Returns a unicode object representing 's'. Treats bytestrings using the
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    28
    'encoding' codec.
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    29
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    30
    If strings_only is True, don't convert (some) non-string-like objects.
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    31
    """
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    32
    if isinstance(s, Promise):
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    33
        # The input is the result of a gettext_lazy() call.
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    34
        return s
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    35
    return force_unicode(s, encoding, strings_only, errors)
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    36
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    37
def force_unicode(s, encoding='utf-8', strings_only=False, errors='strict'):
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    38
    """
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    39
    Similar to smart_unicode, except that lazy instances are resolved to
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    40
    strings, rather than kept as lazy objects.
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    41
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    42
    If strings_only is True, don't convert (some) non-string-like objects.
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    43
    """
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    44
    if strings_only and isinstance(s, (types.NoneType, int, long, datetime.datetime, datetime.date, datetime.time, float)):
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    45
        return s
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    46
    try:
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    47
        if not isinstance(s, basestring,):
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    48
            if hasattr(s, '__unicode__'):
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    49
                s = unicode(s)
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    50
            else:
323
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    51
                try:
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    52
                    s = unicode(str(s), encoding, errors)
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    53
                except UnicodeEncodeError:
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    54
                    if not isinstance(s, Exception):
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    55
                        raise
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    56
                    # If we get to here, the caller has passed in an Exception
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    57
                    # subclass populated with non-ASCII data without special
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    58
                    # handling to display as a string. We need to handle this
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    59
                    # without raising a further exception. We do an
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    60
                    # approximation to what the Exception's standard str()
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    61
                    # output should be.
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    62
                    s = ' '.join([force_unicode(arg, encoding, strings_only,
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    63
                            errors) for arg in s])
54
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    64
        elif not isinstance(s, unicode):
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    65
            # Note: We use .decode() here, instead of unicode(s, encoding,
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    66
            # errors), so that if s is a SafeString, it ends up being a
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    67
            # SafeUnicode at the end.
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    68
            s = s.decode(encoding, errors)
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    69
    except UnicodeDecodeError, e:
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    70
        raise DjangoUnicodeDecodeError(s, *e.args)
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    71
    return s
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    72
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    73
def smart_str(s, encoding='utf-8', strings_only=False, errors='strict'):
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    74
    """
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    75
    Returns a bytestring version of 's', encoded as specified in 'encoding'.
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    76
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    77
    If strings_only is True, don't convert (some) non-string-like objects.
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    78
    """
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    79
    if strings_only and isinstance(s, (types.NoneType, int)):
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    80
        return s
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    81
    if isinstance(s, Promise):
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    82
        return unicode(s).encode(encoding, errors)
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    83
    elif not isinstance(s, basestring):
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    84
        try:
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    85
            return str(s)
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    86
        except UnicodeEncodeError:
323
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    87
            if isinstance(s, Exception):
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    88
                # An Exception subclass containing non-ASCII data that doesn't
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    89
                # know how to print itself properly. We shouldn't raise a
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    90
                # further exception.
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    91
                return ' '.join([smart_str(arg, encoding, strings_only,
ff1a9aa48cfd Load ../vendor/django into trunk/app/django.
Pawel Solyga <Pawel.Solyga@gmail.com>
parents: 54
diff changeset
    92
                        errors) for arg in s])
54
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    93
            return unicode(s).encode(encoding, errors)
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    94
    elif isinstance(s, unicode):
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    95
        return s.encode(encoding, errors)
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    96
    elif s and encoding != 'utf-8':
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    97
        return s.decode('utf-8', errors).encode(encoding, errors)
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    98
    else:
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
    99
        return s
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   100
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   101
def iri_to_uri(iri):
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   102
    """
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   103
    Convert an Internationalized Resource Identifier (IRI) portion to a URI
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   104
    portion that is suitable for inclusion in a URL.
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   105
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   106
    This is the algorithm from section 3.1 of RFC 3987.  However, since we are
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   107
    assuming input is either UTF-8 or unicode already, we can simplify things a
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   108
    little from the full method.
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   109
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   110
    Returns an ASCII string containing the encoded result.
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   111
    """
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   112
    # The list of safe characters here is constructed from the printable ASCII
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   113
    # characters that are not explicitly excluded by the list at the end of
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   114
    # section 3.1 of RFC 3987.
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   115
    if iri is None:
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   116
        return iri
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   117
    return urllib.quote(smart_str(iri), safe='/#%[]=:;$&()+,!?*')
03e267d67478 Major reorganization of the soc svn repo, to merge into a single App Engine
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   118