app/python25src/urllib.py
author Lennard de Rijk <ljvderijk@gmail.com>
Fri, 03 Jul 2009 18:09:48 +0200
changeset 2509 3788c916776f
parent 280 ce9b10bbdd42
permissions -rw-r--r--
Corrected the links to Grading Project Surveys in Program menu.
280
ce9b10bbdd42 urllib.quote() is needed by the soc/logic/site/map.py work, so import it from
Todd Larsen <tlarsen@google.com>
"""Open an arbitrary URL.

See the following document for more info on URLs:
"Names and Addresses, URIs, URLs, URNs, URCs", at
http://www.w3.org/pub/WWW/Addressing/Overview.html

See also the HTTP spec (from which the error codes are derived):
"HTTP - Hypertext Transfer Protocol", at
http://www.w3.org/pub/WWW/Protocols/

Related standards and specs:
- RFC1808 - the "relative URL" spec. (authoritative status)
- RFC1738 - the "URL standard". (authoritative status)
- RFC1630 - the "URI spec". (informational status)

All code but that related to URL parsing has been removed (since it is not
compatible with Google App Engine) from this fork of the original file,
obtained from:
http://svn.python.org/view/*checkout*/python/tags/r252/Lib/urllib.py?content-type=text%2Fplain&rev=60915
"""

import string
import sys
from urlparse import urljoin as basejoin

__all__ = ["quote", "quote_plus", "unquote", "unquote_plus",
           "urlencode", "splittag",
           "basejoin", "unwrap",
           "splittype", "splithost", "splituser", "splitpasswd", "splitport",
           "splitnport", "splitquery", "splitattr", "splitvalue",
           "splitgophertype",]

__version__ = '1.17'    # XXX This version is not always updated :-(


# Utilities to parse URLs (most of these return None for missing parts):
# unwrap('<URL:type://host/path>') --> 'type://host/path'
# splittype('type:opaquestring') --> 'type', 'opaquestring'
# splithost('//host[:port]/path') --> 'host[:port]', '/path'
# splituser('user[:passwd]@host[:port]') --> 'user[:passwd]', 'host[:port]'
# splitpasswd('user:passwd') -> 'user', 'passwd'
# splitport('host:port') --> 'host', 'port'
# splitquery('/path?query') --> '/path', 'query'
# splittag('/path#tag') --> '/path', 'tag'
# splitattr('/path;attr1=value1;attr2=value2;...') ->
#   '/path', ['attr1=value1', 'attr2=value2', ...]
# splitvalue('attr=value') --> 'attr', 'value'
# splitgophertype('/Xselector') --> 'X', 'selector'
# unquote('abc%20def') -> 'abc def'
# quote('abc def') -> 'abc%20def'

try:
    unicode
except NameError:
    def _is_unicode(x):
        return 0
else:
    def _is_unicode(x):
        return isinstance(x, unicode)

def toBytes(url):
    """toBytes(u"URL") --> 'URL'."""
    # Most URL schemes require ASCII. If that changes, the conversion
    # can be relaxed
    if _is_unicode(url):
        try:
            url = url.encode("ASCII")
        except UnicodeError:
            raise UnicodeError("URL " + repr(url) +
                               " contains non-ASCII characters")
    return url

def unwrap(url):
    """unwrap('<URL:type://host/path>') --> 'type://host/path'."""
    url = url.strip()
    if url[:1] == '<' and url[-1:] == '>':
        url = url[1:-1].strip()
    if url[:4] == 'URL:': url = url[4:].strip()
    return url

_typeprog = None
def splittype(url):
    """splittype('type:opaquestring') --> 'type', 'opaquestring'."""
    global _typeprog
    if _typeprog is None:
        # Compile the scheme pattern lazily, on first use.
        import re
        _typeprog = re.compile(r'^([^/:]+):')

    match = _typeprog.match(url)
    if match:
        scheme = match.group(1)
        # The scheme is lowercased; the rest of the URL is returned unchanged.
        return scheme.lower(), url[len(scheme) + 1:]
    return None, url
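
The scheme split above can be tried in isolation. The following is a minimal sketch, assuming the same regular expression as the listing; the name `split_scheme` is hypothetical and is used only so the example stands alone without the Python 2 module around it.

```python
import re

# Same pattern as _typeprog in the listing: everything up to the first ':',
# provided no '/' or ':' occurs before it.
_SCHEME_RE = re.compile(r'^([^/:]+):')


def split_scheme(url):
    """Hypothetical stand-in for splittype(): return (scheme, rest)."""
    match = _SCHEME_RE.match(url)
    if match:
        scheme = match.group(1)
        # Lowercase the scheme only; the remainder keeps its original case.
        return scheme.lower(), url[len(scheme) + 1:]
    return None, url


with_scheme = split_scheme('HTTP://example.com/Index.html')
without_scheme = split_scheme('/relative/path')
```

A URL without a scheme comes back with `None` in the first position, which matches the note in the listing that most of these helpers return `None` for missing parts.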

_hostprog = None
def splithost(url):
    """splithost('//host[:port]/path') --> 'host[:port]', '/path'."""
    global _hostprog
    if _hostprog is None:
        import re
        _hostprog = re.compile(r'^//([^/?]*)(.*)$')

    match = _hostprog.match(url)
    if match: return match.group(1, 2)
    return None, url
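
As a rough illustration of what the host pattern accepts, here is a small standalone sketch. It assumes the same regular expression as the listing; `split_host` is a hypothetical name chosen for the example.

```python
import re

# Same pattern as _hostprog in the listing: a leading '//', then the host
# part up to the first '/' or '?', then everything else.
_HOST_RE = re.compile(r'^//([^/?]*)(.*)$')


def split_host(url):
    """Hypothetical stand-in for splithost(): return (host, rest)."""
    match = _HOST_RE.match(url)
    if match:
        return match.group(1, 2)
    return None, url


host_and_path = split_host('//example.com:8080/a/b?x=1')
host_and_query = split_host('//example.com?x=1')
no_host = split_host('/just/a/path')
```

The input is expected to start with `//`, that is, with the scheme already stripped off; anything else is returned unchanged with `None` as the host.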

_userprog = None
def splituser(host):
    """splituser('user[:passwd]@host[:port]') --> 'user[:passwd]', 'host[:port]'."""
    global _userprog
    if _userprog is None:
        import re
        _userprog = re.compile(r'^(.*)@(.*)$')

    match = _userprog.match(host)
    if match: return map(unquote, match.group(1, 2))
    return None, host

_passwdprog = None
def splitpasswd(user):
    """splitpasswd('user:passwd') -> 'user', 'passwd'."""
    global _passwdprog
    if _passwdprog is None:
        import re
        _passwdprog = re.compile(r'^([^:]*):(.*)$')

    match = _passwdprog.match(user)
    if match: return match.group(1, 2)
    return user, None

# splitport('host:port') --> 'host', 'port'
_portprog = None
def splitport(host):
    """splitport('host:port') --> 'host', 'port'."""
    global _portprog
    if _portprog is None:
        import re
        _portprog = re.compile(r'^(.*):([0-9]+)$')

    match = _portprog.match(host)
    if match: return match.group(1, 2)
    return host, None

_nportprog = None
def splitnport(host, defport=-1):
    """Split host and port, returning numeric port.
    Return given default port if no ':' found; defaults to -1.
    Return numerical port if a valid number is found after ':'.
    Return None if ':' but not a valid number."""
    global _nportprog
    if _nportprog is None:
        import re
        _nportprog = re.compile(r'^(.*):(.*)$')

    match = _nportprog.match(host)
    if match:
        host, port = match.group(1, 2)
        try:
            # An empty port ('host:') is treated like a non-numeric one.
            if not port: raise ValueError("no digits")
            nport = int(port)
        except ValueError:
            nport = None
        return host, nport
    return host, defport
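
The docstring of `splitnport` describes three distinct outcomes: a numeric port, the default port, or `None`. The sketch below walks through each case. It assumes the same pattern and control flow as the listing; `split_numeric_port` is a hypothetical name so the example runs on its own.

```python
import re

# Same pattern as _nportprog in the listing: split on the last ':'.
_NPORT_RE = re.compile(r'^(.*):(.*)$')


def split_numeric_port(host, defport=-1):
    """Hypothetical stand-in for splitnport(): return (host, port)."""
    match = _NPORT_RE.match(host)
    if match:
        host, port = match.group(1, 2)
        try:
            if not port:
                raise ValueError("no digits")
            nport = int(port)
        except ValueError:
            # A ':' was present but what follows is not a number.
            nport = None
        return host, nport
    # No ':' at all: fall back to the caller's default.
    return host, defport


numeric = split_numeric_port('example.com:8080')
defaulted = split_numeric_port('example.com', 80)
not_a_number = split_numeric_port('example.com:http')
empty_port = split_numeric_port('example.com:')
```

Callers therefore need to distinguish `None` (a port was given but is unusable) from the default value (no port was given).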

_queryprog = None
def splitquery(url):
    """splitquery('/path?query') --> '/path', 'query'."""
    global _queryprog
    if _queryprog is None:
        import re
        _queryprog = re.compile('^(.*)\?([^?]*)$')

    match = _queryprog.match(url)
    if match: return match.group(1, 2)
    return url, None
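
A small standalone sketch of how the pattern above behaves, written for Python 3. The names `_QUERY_RE` and `split_query_sketch` are hypothetical and not part of this module; the sketch only mirrors the regular expression to show that the split happens at the last '?' in the string.

```python
import re

# Same pattern as _queryprog: a greedy prefix, then the last '?' and its tail.
_QUERY_RE = re.compile(r'^(.*)\?([^?]*)$')

def split_query_sketch(url):
    """Hypothetical stand-in for splitquery(); returns (path, query or None)."""
    match = _QUERY_RE.match(url)
    if match:
        return match.group(1, 2)
    return url, None

print(split_query_sketch('/path?query'))   # ('/path', 'query')
print(split_query_sketch('/a?b?c'))        # ('/a?b', 'c')
print(split_query_sketch('/no-query'))     # ('/no-query', None)
```

Because the first group is greedy, any earlier '?' characters stay in the path part. splittag uses the same approach with '#'.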

_tagprog = None
def splittag(url):
    """splittag('/path#tag') --> '/path', 'tag'."""
    global _tagprog
    if _tagprog is None:
        import re
        _tagprog = re.compile('^(.*)#([^#]*)$')

    match = _tagprog.match(url)
    if match: return match.group(1, 2)
    return url, None

def splitattr(url):
    """splitattr('/path;attr1=value1;attr2=value2;...') ->
        '/path', ['attr1=value1', 'attr2=value2', ...]."""
    words = url.split(';')
    return words[0], words[1:]

_valueprog = None
def splitvalue(attr):
    """splitvalue('attr=value') --> 'attr', 'value'."""
    global _valueprog
    if _valueprog is None:
        import re
        _valueprog = re.compile('^([^=]*)=(.*)$')

    match = _valueprog.match(attr)
    if match: return match.group(1, 2)
    return attr, None

def splitgophertype(selector):
    """splitgophertype('/Xselector') --> 'X', 'selector'."""
    if selector[:1] == '/' and selector[1:2]:
        return selector[1], selector[2:]
    return None, selector

_hextochr = dict(('%02x' % i, chr(i)) for i in range(256))
_hextochr.update(('%02X' % i, chr(i)) for i in range(256))

def unquote(s):
    """unquote('abc%20def') -> 'abc def'."""
    res = s.split('%')
    for i in xrange(1, len(res)):
        item = res[i]
        try:
            res[i] = _hextochr[item[:2]] + item[2:]
        except KeyError:
            res[i] = '%' + item
        except UnicodeDecodeError:
            res[i] = unichr(int(item[:2], 16)) + item[2:]
    return "".join(res)
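
The table-lookup decoding above can be sketched as standalone Python 3 code. The names `_HEX_TO_CHR` and `unquote_sketch` are hypothetical. The sketch works on text strings and maps each escape to the code point with that value, which is an assumption that differs from the byte strings handled by the Python 2 original; the Unicode fallback branch is omitted.

```python
# Lookup table keyed by two hex digits, in all-lower and all-upper case,
# built the same way as _hextochr.
_HEX_TO_CHR = dict(('%02x' % i, chr(i)) for i in range(256))
_HEX_TO_CHR.update(('%02X' % i, chr(i)) for i in range(256))

def unquote_sketch(s):
    """Hypothetical stand-in for unquote(); malformed escapes are kept as-is."""
    parts = s.split('%')
    for i in range(1, len(parts)):
        item = parts[i]
        try:
            # Decode the two hex digits that followed the '%'.
            parts[i] = _HEX_TO_CHR[item[:2]] + item[2:]
        except KeyError:
            # Not a known hex pair: put the '%' back untouched.
            parts[i] = '%' + item
    return ''.join(parts)

print(unquote_sketch('abc%20def'))   # abc def
print(unquote_sketch('100%zz'))      # 100%zz
```

Since the table holds only all-lowercase and all-uppercase keys, a mixed-case pair such as '%aB' is not found and is left undecoded.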

def unquote_plus(s):
    """unquote_plus('%7e/abc+def') -> '~/abc def'"""
    s = s.replace('+', ' ')
    return unquote(s)

always_safe = ('ABCDEFGHIJKLMNOPQRSTUVWXYZ'
               'abcdefghijklmnopqrstuvwxyz'
               '0123456789' '_.-')
_safemaps = {}

def quote(s, safe = '/'):
    """quote('abc def') -> 'abc%20def'

    Each part of a URL, e.g. the path info, the query, etc., has a
    different set of reserved characters that must be quoted.

    RFC 2396 Uniform Resource Identifiers (URI): Generic Syntax lists
    the following reserved characters.

    reserved    = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
                  "$" | ","

    Each of these characters is reserved in some component of a URL,
    but not necessarily in all of them.

    By default, the quote function is intended for quoting the path
    section of a URL.  Thus, it will not encode '/'.  This character
    is reserved, but in typical usage the quote function is being
    called on a path where the existing slash characters are used as
    reserved characters.
    """
    cachekey = (safe, always_safe)
    try:
        safe_map = _safemaps[cachekey]
    except KeyError:
        safe += always_safe
        safe_map = {}
        for i in range(256):
            c = chr(i)
            safe_map[c] = (c in safe) and c or ('%%%02X' % i)
        _safemaps[cachekey] = safe_map
    res = map(safe_map.__getitem__, s)
    return ''.join(res)

def quote_plus(s, safe = ''):
    """Quote the query fragment of a URL; replacing ' ' with '+'"""
    if ' ' in s:
        s = quote(s, safe + ' ')
        return s.replace(' ', '+')
    return quote(s, safe)

def urlencode(query,doseq=0):
    """Encode a sequence of two-element tuples or dictionary into a URL query string.

    If any values in the query arg are sequences and doseq is true, each
    sequence element is converted to a separate parameter.

    If the query arg is a sequence of two-element tuples, the order of the
    parameters in the output will match the order of parameters in the
    input.
    """

    if hasattr(query,"items"):
        # mapping objects
        query = query.items()
    else:
        # it's a bother at times that strings and string-like objects are
        # sequences...
        try:
            # non-sequence items should not work with len()
            # non-empty strings will fail this
            if len(query) and not isinstance(query[0], tuple):
                raise TypeError
            # zero-length sequences of all types will get here and succeed,
            # but that's a minor nit - since the original implementation
            # allowed empty dicts that type of behavior probably should be
            # preserved for consistency
        except TypeError:
            ty,va,tb = sys.exc_info()
            raise TypeError, "not a valid non-string sequence or mapping object", tb

    l = []
    if not doseq:
        # preserve old behavior
        for k, v in query:
            k = quote_plus(str(k))
            v = quote_plus(str(v))
            l.append(k + '=' + v)
    else:
        for k, v in query:
            k = quote_plus(str(k))
            if isinstance(v, str):
                v = quote_plus(v)
                l.append(k + '=' + v)
            elif _is_unicode(v):
                # is there a reasonable way to convert to ASCII?
                # encode generates a string, but "replace" or "ignore"
                # lose information and "strict" can raise UnicodeError
                v = quote_plus(v.encode("ASCII","replace"))
                l.append(k + '=' + v)
            else:
                try:
                    # is this a sufficient test for sequence-ness?
                    x = len(v)
                except TypeError:
                    # not a sequence
                    v = quote_plus(str(v))
                    l.append(k + '=' + v)
                else:
                    # loop over the sequence
                    for elt in v:
                        l.append(k + '=' + quote_plus(str(elt)))
    return '&'.join(l)
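
A short usage sketch of the doseq flag, run against Python 3's `urllib.parse.urlencode` rather than this Python 2.5 module. It assumes the Python 3 function keeps the behaviour described in the docstring above; the sample data in `pairs` is made up for illustration.

```python
from urllib.parse import urlencode

# A sequence of two-element tuples keeps its order in the output.
pairs = [('tag', ['a b', 'c']), ('page', 2)]

# Without doseq, each value is passed through str() and quoted as a whole.
flat = urlencode(pairs)

# With doseq, every element of a sequence value becomes its own parameter.
expanded = urlencode(pairs, doseq=True)

print(flat)
print(expanded)   # tag=a+b&tag=c&page=2
```

With doseq left off, the list is encoded as the text of its str() form, which is rarely what a receiving server expects.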