scripts/munge.py
author David Anderson <david.jc.anderson@gmail.com>
Fri, 13 Mar 2009 02:56:35 +0000
changeset 1826 12de6d73a908
parent 541 d572b0fb6bfe
permissions -rwxr-xr-x
Followup to r2496. Fix obvious errors.
480
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
#!/usr/bin/python2.5
#
# Copyright 2008 the Melange authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# __doc__ string is slightly unconventional because it is used as usage text
"""%prog [OPTIONS] [FIND_REGEX] [REPLACE_FORMAT]

Script to list, search, and modify files using Python regex patterns.

OPTIONS:  optional command-line flags; see %prog --help

FIND_REGEX:  an optional valid Python regular expression pattern;
  if supplied, only files containing at least one match will be processed;
  matching file paths will be printed; if supplied, REPLACE_FORMAT will be
  used to convert the match groups into formatted output.

REPLACE_FORMAT:  an optional valid Python format string;
  FIND_REGEX must be supplied first if REPLACE_FORMAT is supplied;
  positional arguments will be replaced with ordered groups from
  FIND_REGEX matches, and named arguments will be replaced with named
  groups from FIND_REGEX matches."""
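
The REPLACE_FORMAT rules in the usage text above can be illustrated with a minimal sketch. It is written for Python 3 rather than the script's Python 2.5, and `format_match` is a hypothetical helper introduced only for illustration; it is not a function from munge.py.

```python
import re


def format_match(match, fmt):
    """Hypothetical helper: applies a %-style format string to one match."""
    if match.groupdict():
        # Named groups feed named specifiers such as %(key)s.
        return fmt % match.groupdict()
    if match.groups():
        # Unnamed groups feed positional specifiers, in order.
        return fmt % match.groups()
    try:
        # No groups: a single specifier receives the entire match.
        return fmt % match.group()
    except TypeError:
        # No specifier at all: the format string is used verbatim.
        return fmt


named = re.search(r'(?P<key>\w+)=(?P<value>\w+)', 'color=red')
ordered = re.search(r'(\w+)=(\w+)', 'color=red')
plain = re.search(r'\w+=\w+', 'color=red')

print(format_match(named, '%(value)s is the %(key)s'))
print(format_match(ordered, '%s -> %s'))
print(format_match(plain, '[%s]'))
print(format_match(plain, 'REDACTED'))
```

Checking `groupdict()` before `groups()` matters in this sketch because a pattern with named groups also reports them through `groups()`.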

__authors__ = [
  '"Todd Larsen" <tlarsen@google.com>',
]


import dircache
import errno
import os
import optparse
import re
import sre_constants
import sys


class Error(Exception):
  """Base class of all exceptions in this module.
  """
  pass


def compileRegex(pattern):
  """Compiles a Python regex pattern into a regex object.

  Args:
    pattern: valid Python regex pattern string, or an already-compiled
      regex object (in which case this function is a no-op)

  Returns:
    regex object compiled from pattern

  Raises:
    Error if pattern could not be compiled.
  """
  try:
    return re.compile(pattern)
  except sre_constants.error, error:
    msg = 're.compile: %s\n%s' % (error.args[0], pattern)
    raise Error(errno.EINVAL, msg)
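
For readers on Python 3, the same error-wrapping pattern might look like the sketch below. The snake_case name `compile_regex` and the use of `re.error` in place of `sre_constants.error` are assumptions made for this illustration, not part of the original script.

```python
import errno
import re


class Error(Exception):
  """Base class of all exceptions in this sketch."""


def compile_regex(pattern):
  """Compiles pattern, re-raising regex syntax errors as Error."""
  try:
    # re.compile() hands back an already-compiled pattern unchanged.
    return re.compile(pattern)
  except re.error as error:
    msg = 're.compile: %s\n%s' % (error, pattern)
    raise Error(errno.EINVAL, msg)


compiled = compile_regex(r'\d+')
same_object = compile_regex(compiled) is compiled

try:
  compile_regex('(unclosed')
  caught = None
except Error as error:
  caught = error
```

Carrying `errno.EINVAL` as the first argument lets a caller turn the exception into a process exit status without parsing the message.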


def findAll(text_to_search, pattern):
  """Returns all matches of a regex in a string.

  Args:
    text_to_search: string in which to find matches
    pattern: Python regex pattern (or already-compiled regex object)
      indicating which matches to retrieve

  Returns:
    a (possibly empty) list of the matches found, as strings
  """
  matches = []

  def _captureMatchText(match):
    match_text = match.group()
    matches.append(match_text)
    return match_text

  compileRegex(pattern).sub(_captureMatchText, text_to_search)

  return matches
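
The callback-collection technique used by findAll() can be sketched in Python 3 as follows, next to an equivalent formulation using `finditer()`. Both function names here are hypothetical and chosen for illustration. Whole-match text is taken with `match.group()` because `re.findall()` returns group contents instead of whole matches when the pattern contains groups.

```python
import re


def find_all_via_sub(text, pattern):
  """Collects whole-match strings by using sub() purely for its callback."""
  matches = []

  def capture(match):
    matches.append(match.group())
    return match.group()  # replace each match with itself; text is unchanged

  re.compile(pattern).sub(capture, text)
  return matches


def find_all_via_finditer(text, pattern):
  """Same result without the substitution pass."""
  return [match.group() for match in re.compile(pattern).finditer(text)]


text = 'a1 b22 c333'
via_sub = find_all_via_sub(text, r'([a-z])(\d+)')
via_finditer = find_all_via_finditer(text, r'([a-z])(\d+)')
print(via_sub)
```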


def getFileContents(file_path):
  """Reads the contents of a file as a single string, then closes the file.

  Args:
    file_path: path of the file whose contents are read into a string

  Returns:
    a single string containing the entire contents of the file
  """
  file_to_read = open(file_path)
  file_contents = file_to_read.read()
  file_to_read.close()
  return file_contents


def findAllInFile(file_path, pattern, *ignored_args, **ignored_kwargs):
  """Action to return a list of all pattern matches in a file.

  Args:
    file_path: path of file to manipulate
    pattern: see findAll()
    *ignored_args: other positional arguments which are ignored
      command-line arguments not used by this action callable
    **ignored_kwargs: other keyword arguments which are ignored
      command-line options not used by this action callable

  Returns:
    two-tuple of boolean indicating if any match was found and a
    (possibly empty) list of the matches found, as strings (to be used
    as printable output of the action)
  """
  matches = findAll(getFileContents(file_path), pattern)

  if matches:
    found = True
  else:
    found = False

  return found, matches
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   137
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   138
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   139
def replaceAll(original, pattern, format):
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   140
  """Substitutes formatted text for all matches in a string.
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   141
  
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   142
  Args:
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   143
    original: original string in which to find and replace matches
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   144
    pattern: Python regex pattern (or already-compiled regex object)
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   145
      indicating which matches to replace
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   146
    format: Python format string specifying how to format the
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   147
      replacement text; how this format string is interpreted depends
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   148
      on the contents of the pattern;  if the pattern contains:
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   149
        named groups: format is expected to contain named format specifiers
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   150
        unnamed groups: format is expected to contain exactly the same
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   151
          number of unnamed format specifiers as the number of groups in
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   152
          pattern
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   153
        no groups: format is expected to contain a single format specifier
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   154
          (in which case the entire match is supplied to it), or no format
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   155
          specifier at all (in which case the "format" string simply
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   156
          replaces the match with no substitutions from the match itself)
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   157
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   158
  Returns:
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   159
    two-tuple of the text with all matches replaced as specified by
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   160
    pattern and format, and a list of the original matches, each followed
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   161
    by its replacement 
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   162
  """
  matches_and_replacements = []

  def _replaceWithFormat(match):
    # Try the richest substitution first: named groups, then positional
    # groups, then the whole match; fall back to the literal format string.
    formatted_match = None

    if match.groupdict():
      try:
        formatted_match = format % match.groupdict()
      except TypeError:
        pass

    # Compare against None so that a legitimately empty replacement is kept.
    if (formatted_match is None) and match.groups():
      try:
        formatted_match = format % match.groups()
      except TypeError:
        pass

    if formatted_match is None:
      try:
        formatted_match = format % match.group()
      except TypeError:
        formatted_match = format

    matches_and_replacements.append(match.group())
    matches_and_replacements.append(formatted_match)
    return formatted_match

  replaced = compileRegex(pattern).sub(_replaceWithFormat, original)

  return replaced, matches_and_replacements
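
The fallback order inside _replaceWithFormat() can be tried in isolation. What follows is a simplified sketch, not the code above: format_match and replace_all are hypothetical names, and re.compile is used directly where munge.py calls its own compileRegex().

```python
import re


def format_match(match, fmt):
  """Hypothetical helper mirroring the fallback order described above."""
  candidates = []
  if match.groupdict():
    candidates.append(match.groupdict())  # named groups: '%(name)s'
  if match.groups():
    candidates.append(match.groups())     # positional groups: '%s %s'
  candidates.append(match.group())        # whole match: a single '%s'

  for values in candidates:
    try:
      return fmt % values
    except TypeError:
      continue  # format string does not fit these values; try the next
  return fmt  # nothing fits: use the format string literally


def replace_all(original, pattern, fmt):
  """Returns the replaced text and a flat list of match/replacement pairs."""
  pairs = []

  def _sub(match):
    replacement = format_match(match, fmt)
    pairs.extend([match.group(), replacement])
    return replacement

  return re.compile(pattern).sub(_sub, original), pairs


named = replace_all('key=value', r'(?P<k>\w+)=(?P<v>\w+)', '%(v)s=%(k)s')
positional = replace_all('key=value', r'(\w+)=(\w+)', '%s: %s')
whole = replace_all('a b', r'\w', '<%s>')
literal = replace_all('a b', r'\w', 'X')
```

The four calls exercise, in order, the named-group, positional-group, whole-match and literal-format paths.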


def writeAltFileIfExt(path, ext, contents):
  """Writes a file if path and additional extension are supplied.

  If either path or ext is not supplied, no file is written.

  Args:
    path: path of file to be written, to which ext will be appended
    ext: additional file extension that will be appended to path
    contents: contents of file to be written, as a string
  """
  if (not path) or (not ext):
    return

  if ext.startswith('.'):
    ext = ext[1:]

  alt_path = '%s.%s' % (path, ext)
  alt_file = open(alt_path, 'w')
  alt_file.write(contents)
  alt_file.close()
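
The naming rule used by writeAltFileIfExt() is easy to check without touching the disk. A small sketch, where alt_path_for is a hypothetical helper that only computes the path the function above would write to (or None when nothing would be written):

```python
def alt_path_for(path, ext):
  """Hypothetical helper: path writeAltFileIfExt() would use, or None."""
  if (not path) or (not ext):
    return None  # either piece missing: no file is written

  if ext.startswith('.'):
    ext = ext[1:]  # only a single leading dot is dropped

  return '%s.%s' % (path, ext)


examples = [
    alt_path_for('notes.txt', 'bak'),
    alt_path_for('notes.txt', '.bak'),
    alt_path_for('notes.txt', ''),
    alt_path_for('notes.txt', None),
]
```

Both 'bak' and '.bak' yield the same target, so callers can pass the extension either way.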


def replaceAllInFile(file_path, pattern, format,
                     new_ext=None, backup_ext=None,
                     overwrite_files=False,
                     *ignored_args, **ignored_kwargs):
  """Substitutes formatted text for all matches in a file.

  Args:
    file_path: path of file to manipulate
    pattern, format: see replaceAll()
    new_ext: optional extension; if supplied and any match is found, the
      replaced text is written to file_path with this extension appended
    backup_ext: optional extension; if supplied and any match is found, the
      original text is written to file_path with this extension appended
    overwrite_files: optional boolean; if True, file_path is overwritten
      with the replaced text when it differs from the original
    *ignored_args: other positional arguments which are ignored
      command-line arguments not used by this action callable
    **ignored_kwargs: other keyword arguments which are ignored
      command-line options not used by this action callable

  Returns:
    two-tuple of boolean indicating if any match was found and a
    list of printable output text lines containing pairs of original
    pattern matches each followed by the formatted replacement
  """
  original = getFileContents(file_path)

  replaced, matches_and_replacements = replaceAll(
    original, pattern, format)

  if matches_and_replacements:
    found = True
    writeAltFileIfExt(file_path, new_ext, replaced)
    writeAltFileIfExt(file_path, backup_ext, original)

    if overwrite_files:
      if replaced != original:
        replaced_file = open(file_path, 'w')
        replaced_file.write(replaced)
        replaced_file.close()
  else:
    found = False

  return found, matches_and_replacements
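
How new_ext, backup_ext and overwrite_files interact is easiest to see on a scratch file. The sketch below is a simplified stand-in, not the function above: replace_in_file is a hypothetical name, it uses re.subn with a plain replacement string instead of replaceAll() and its %-style format, and it returns only the "found" boolean.

```python
import os
import re
import tempfile


def replace_in_file(file_path, pattern, replacement,
                    new_ext=None, backup_ext=None, overwrite_files=False):
  """Hypothetical, simplified stand-in for replaceAllInFile()."""
  with open(file_path) as source:
    original = source.read()

  replaced, count = re.subn(pattern, replacement, original)
  if not count:
    return False  # no match: nothing is written at all

  # Side files are written only when the matching extension is supplied.
  for ext, text in ((new_ext, replaced), (backup_ext, original)):
    if ext:
      if ext.startswith('.'):
        ext = ext[1:]
      with open('%s.%s' % (file_path, ext), 'w') as alt_file:
        alt_file.write(text)

  # The input file itself is touched only on request, and only if changed.
  if overwrite_files and replaced != original:
    with open(file_path, 'w') as target:
      target.write(replaced)

  return True


work_dir = tempfile.mkdtemp()
path = os.path.join(work_dir, 'greeting.txt')
with open(path, 'w') as handle:
  handle.write('hello world\n')

found = replace_in_file(path, r'world', 'there',
                        backup_ext='.orig', overwrite_files=True)
not_found = replace_in_file(path, r'absent', 'x', new_ext='new')
```

After the first call the original text survives in greeting.txt.orig while greeting.txt holds the replaced text; the second call finds no match and so writes no greeting.txt.new.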


def listFile(*ignored_args, **ignored_kwargs):
  """No-op action callable that ignores arguments and returns (True, []).
  """
  return True, []  # match only based on file names, which was done by caller
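
listFile() is the smallest possible action callable: it accepts a file path plus whatever positional and keyword arguments the caller passes, and returns a "matched" boolean with a list of output strings. As a sketch of another callable with the same shape, the hypothetical findMatchingLines below (not part of munge.py) reports the lines of a file that contain a pattern.

```python
import os
import re
import tempfile


def findMatchingLines(file_path, pattern, *ignored_args, **ignored_kwargs):
  """Hypothetical action callable: reports lines that contain pattern."""
  regex = re.compile(pattern)
  with open(file_path) as source:
    hits = [line.rstrip('\n') for line in source if regex.search(line)]
  return bool(hits), hits


handle, sample_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(handle, 'w') as sample:
  sample.write('alpha\nbeta\ngamma\n')

# Called the way a caller would: path first, then positional arguments,
# then any leftover keyword options (ignored here).
matched, output = findMatchingLines(sample_path, r'^b', quiet=True)
missed, no_output = findMatchingLines(sample_path, r'zzz')
os.remove(sample_path)
```

Accepting and discarding extra arguments, as listFile() does, lets one set of command-line options be passed unchanged to every action.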


def applyActionToFiles(action, action_args,
                       start_path='', abs_path=False, files_pattern='',
                       recurse_dirs=False, dirs_pattern='',
                       follow_symlinks=False, quiet_output=False,
                       hide_paths=False, hide_text=False, **action_options):
  """Applies a callable action to files, based on options and arguments.

  Args:
    action: callable that expects a file path argument, positional arguments
      (action_args), and keyword options from the command-line options dict;
      and returns a "matched" boolean and a list of output strings
    action_args: list of positional arguments, if any; passed to action
      callable unchanged
    start_path: required path of initial directory to visit
    abs_path: optional boolean indicating to use absolute paths
    files_pattern: required Python regex (object or pattern) which selects
      which files to pass to the action callable
    recurse_dirs: boolean indicating if subdirectories should be traversed
    dirs_pattern: Python regex (object or pattern) which selects which
      subdirectories to traverse if recurse_dirs is True
    follow_symlinks: boolean indicating if symlinks should be traversed
    quiet_output: optional boolean indicating if output should be suppressed
    hide_paths: optional boolean indicating to omit file paths from output
    hide_text: optional boolean indicating to omit find/replace text from
      output
    **action_options: remaining keyword arguments that are passed unchanged
      to the action callable

  Returns:
    two-tuple containing an exit code and a (possibly empty) list of
    output strings

  Raises:
    Error exception if problems occur (file I/O, invalid regex, etc.).
  """
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   298
  exit_code = errno.ENOENT
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   299
  output = []
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   300
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   301
  start_path = os.path.expandvars(os.path.expanduser(start_path))
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   302
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   303
  if abs_path:
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   304
    start_path = os.path.abspath(start_path)
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   305
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   306
  paths = [start_path]
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   307
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   308
  files_regex = compileRegex(files_pattern)
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   309
  
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   310
  if recurse_dirs:
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   311
    dirs_regex = compileRegex(dirs_pattern)
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   312
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   313
  while paths:
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   314
    sub_paths = []
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   315
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   316
    for path in paths:
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   317
      # expand iterator into an actual list and sort it
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   318
      try:
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   319
        items = dircache.listdir(path)[:]
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   320
      except (IOError, OSError), error:
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   321
        raise Error(error.args[0], '%s: %s' % (
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   322
                    error.__class__.__name__, error.args[1]))
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   323
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   324
      items.sort()
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   325
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   326
      for item in items:
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   327
        item_path = os.path.join(path, item)

        if os.path.islink(item_path):
          if not follow_symlinks:
            continue  # do not follow symlinks (ignore them)

        if os.path.isdir(item_path):
          if recurse_dirs:
            if dirs_regex.match(item):
              sub_paths.append(item_path)
          continue

        # only regular files are processed (a socket, for example, is skipped)
        if os.path.isfile(item_path) and files_regex.match(item):
          try:
            matched, found_output = action(item_path, *action_args,
                                           **action_options)
          except (IOError, OSError), error:
            raise Error(error.args[0], '%s: %s' % (
                        error.__class__.__name__, error.args[1]))

          if matched:
            exit_code = 0  # at least one matched file has now been found

            if (not quiet_output) and (not hide_paths):
              output.append(item_path)

          if (not quiet_output) and (not hide_text):
            output.extend(found_output)

    paths = sub_paths

  return exit_code, output
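
# Illustrative sketch (not part of munge.py): the loop above appears to walk
# the tree one directory level at a time, queueing matching subdirectories in
# sub_paths for the next pass.  The names walk_levels and demo are
# hypothetical, the `while paths:` loop header is an assumption (it lies
# outside this excerpt), and the sketch is written for Python 3.

```python
import os
import re
import shutil
import tempfile


def walk_levels(start, dirs_regex, files_regex, recurse=True):
  """Yields matching regular-file paths, one directory level at a time."""
  paths = [start]
  while paths:
    sub_paths = []
    for path in paths:
      for item in sorted(os.listdir(path)):
        item_path = os.path.join(path, item)
        if os.path.islink(item_path):
          continue  # this sketch never follows symlinks
        if os.path.isdir(item_path):
          if recurse and dirs_regex.match(item):
            sub_paths.append(item_path)  # visit on the next pass
          continue
        if os.path.isfile(item_path) and files_regex.match(item):
          yield item_path
    paths = sub_paths


def demo():
  """Builds a throwaway tree and returns the relative paths that were found."""
  root = tempfile.mkdtemp()
  try:
    os.makedirs(os.path.join(root, 'pkg'))
    os.makedirs(os.path.join(root, '.hg'))
    for rel in ('a.py', os.path.join('pkg', 'b.py'),
                os.path.join('.hg', 'c.py')):
      open(os.path.join(root, rel), 'w').close()
    found = walk_levels(root, re.compile(r'^[^.].*$'), re.compile(r'.*\.py$'))
    return [os.path.relpath(path, root) for path in found]
  finally:
    shutil.rmtree(root)
```

# In demo(), the file under .hg is never reached because the directory
# pattern rejects names that start with a dot.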


class _ErrorOptionParser(optparse.OptionParser):
  """Customized optparse.OptionParser that does not call sys.exit().
  """

  def error(self, msg):
    """Raises an Error exception, instead of calling sys.exit().
    """
    raise Error(errno.EINVAL, msg)
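
# Illustrative sketch (not part of munge.py): overriding error() lets a caller
# catch bad command lines instead of having optparse exit the process.  The
# Error class below is only a stand-in for this module's own Error, assumed
# to take an errno value followed by a message; try_parse is a hypothetical
# helper, and the sketch is written for Python 3.

```python
import errno
import optparse


class Error(Exception):
  """Stand-in for this module's Error class (assumed: errno, then message)."""


class ErrorOptionParser(optparse.OptionParser):
  """OptionParser whose error() raises instead of calling sys.exit()."""

  def error(self, msg):
    raise Error(errno.EINVAL, msg)


def try_parse(args):
  """Parses args and reports either the parsed flag or the error code."""
  parser = ErrorOptionParser()
  parser.add_option('-q', '--quiet', action='store_true', default=False)
  try:
    options, _ = parser.parse_args(args=args)
    return 'ok', options.quiet
  except Error as error:
    return 'error', error.args[0]
```

# try_parse(['-q']) parses normally, while an unknown option such as
# '--bogus' surfaces as a catchable Error carrying errno.EINVAL.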


def _buildParser():
  """Returns a custom OptionParser for parsing command-line arguments.
  """
  parser = _ErrorOptionParser(__doc__)

  filter_group = optparse.OptionGroup(parser,
    'File Options',
    'Options used to select which files to process.')

  # by default, exclude files that are likely to be binary files
  filter_group.add_option(
    '-f', '--files', dest='files_pattern',
    default='(?!^.*\.pyc|.*\.ico|.*\.gif|.*\.png|.*\.jpg$)',
    metavar='FILES_REGEX',
    help=('Python regex pattern (*not* a glob!) defining files to process'
          ' in each directory [default: %default]'))

  filter_group.add_option(
    '-F', '--follow', dest='follow_symlinks', default=False,
    action='store_true',
    help=('follow file and subdirectory symlinks (possibly *DANGEROUS*)'
          ' [default: %default]'))

  parser.add_option_group(filter_group)

  dir_group = optparse.OptionGroup(parser,
    'Directory Options',
    'Options used to indicate which directories to traverse.')

  dir_group.add_option(
    '-s', '--start', dest='start_path', default=os.curdir, metavar='PATH',
    help='directory in which to start processing files [default: %default]')

  dir_group.add_option(
    '-R', '--recursive', dest='recurse_dirs', default=False,
    action='store_true',
    help='recurse into subdirectories [default: %default]')

  # by default, ignore "dot" directories like .hg and .svn
  dir_group.add_option(
    '-d', '--dirs', dest='dirs_pattern', default='^[^.].*$',
    metavar='SUBDIRS_REGEX',
    help=('Python regex pattern (*not* a glob!) defining subdirectories to'
          ' recurse into (if --recursive) [default: %default]'))

  parser.add_option_group(dir_group)

  output_group = optparse.OptionGroup(parser,
    'Output Options',
    'Options used to control program output.')

  output_group.add_option(
    '-a', '--abspath', dest='abs_path', default=False, action='store_true',
    help=('output absolute paths instead of relative paths'
          ' [default: %default]'))

  output_group.add_option(
    '', '--nopaths', dest='hide_paths', default=False, action='store_true',
    help=('suppress printing of file path names for successfully matched'
          ' files to stdout [default: %default]'))

  output_group.add_option(
    '', '--notext', dest='hide_text', default=False, action='store_true',
    help=('suppress find/replace text output to stdout (but still print'
          ' paths if not --nopaths, and still perform replacements if'
          ' specified) [default: %default]'))

  output_group.add_option(
    '-q', '--quiet', dest='quiet_output', default=False, action='store_true',
    help=('suppress *all* printed output to stdout (but still perform'
          ' replacements if specified) [default: %default]'))

  parser.add_option_group(output_group)

  replace_group = optparse.OptionGroup(parser,
    'Replace Options',
    'Options applied when matches in files are replaced with substitutions.'
    ' (Only possible if REPLACE_FORMAT is supplied.)')

  replace_group.add_option(
    '-o', '--overwrite', dest='overwrite_files', default=False,
    action='store_true',
    help=('overwrite original files with formatted text substituted for'
          ' matches [default: %default]'))

  replace_group.add_option(
    '-b', '--backup', dest='backup_ext', default='', metavar='EXTENSION',
    help=('if supplied, and file would be overwritten, back up original'
          ' file with the supplied extension [default: no backups of'
          ' overwritten files are kept]'))

  replace_group.add_option(
    '-n', '--new', dest='new_ext', default='', metavar='EXTENSION',
    help=('if supplied, and file has matches and is altered by'
          ' substitutions, create a new file with the supplied extension'
          ' [default: no new file is created]'))

  parser.add_option_group(replace_group)

  return parser
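
# Illustrative sketch (not part of munge.py): how the default FILES_REGEX
# behaves when applied with match(), as the directory walker does.  Inside the
# lookahead, '^' binds only to the first alternative and '$' only to the last,
# so a name that merely contains '.png' is also excluded.  TIGHTER_FILES is a
# hypothetical variant shown for comparison only, and selected() is a
# hypothetical helper.

```python
import re

# The default FILES_REGEX as written in the option definition (raw string).
DEFAULT_FILES = re.compile(r'(?!^.*\.pyc|.*\.ico|.*\.gif|.*\.png|.*\.jpg$)')

# Hypothetical tighter variant: one extension group, anchored at the end.
TIGHTER_FILES = re.compile(r'(?!.*\.(?:pyc|ico|gif|png|jpg)$)')


def selected(regex, names):
  """Returns the names the regex would select when applied with match()."""
  return [name for name in names if regex.match(name)]


NAMES = ['munge.py', 'logo.png', 'photo.jpg', 'notes.png.txt']
print(selected(DEFAULT_FILES, NAMES))  # ['munge.py']
print(selected(TIGHTER_FILES, NAMES))  # ['munge.py', 'notes.png.txt']
```

# Whether excluding names such as 'notes.png.txt' is intended is not stated
# in the source; pass an explicit --files pattern if it matters.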


def _parseArgs(cmd_line_args):
  """Builds a command-line option parser and parses command-line arguments.

  Args:
    cmd_line_args: command-line arguments, excluding the argv[0] program name

  Returns:
    four-tuple of action callable, supplied command-line options (including
    those defined by defaults in the command-line parser) as a dict,
    remaining positional command-line arguments, and the parser itself

  Raises:
    Error if problems occurred during command-line argument parsing.
  """
  parser = _buildParser()
  options, args = parser.parse_args(args=cmd_line_args)

  if not args:
    # no FIND_REGEX or REPLACE_PATTERN supplied, so just match based
    # on file name and subdirectory name patterns
    action = listFile
  elif len(args) == 1:
    # FIND_REGEX supplied, but not REPLACE_PATTERN, so just match based
    # on file name and subdirectory name patterns, and then on file
    # contents
    action = findAllInFile
  elif len(args) == 2:
    # FIND_REGEX and REPLACE_PATTERN both supplied, so match based
    # on file name and subdirectory name patterns, and then do a find and
    # replace on file contents
    action = replaceAllInFile
  else:
    raise Error(errno.EINVAL, 'too many (%d) arguments supplied:\n%s' % (
                len(args), ' '.join(args)))

  return action, vars(options), args, parser
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   507
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   508
 
9b07ddeb1412 For those times when sed isn't enough, but awk is too much, there's munge.py...
Todd Larsen <tlarsen@google.com>
parents:
diff changeset
   509
def _main(argv):
  """Wrapper that catches exceptions, prints output, and returns exit status.

  Normal program output is printed to stdout.  Error output (including
  exception text) is printed to stderr.

  Args:
    argv: script arguments, usually sys.argv; argv[0] is expected to be the
      program name

  Returns:
    exit code suitable for sys.exit()
  """
  options = {}  # empty options, used if _parseArgs() fails
  parser = None

  try:
    action, options, args, parser = _parseArgs(argv[1:])
    exit_code, output = applyActionToFiles(action, args, **options)

    if output:
      print '\n'.join(output)

  except Error, error:
    if not options.get('quiet_output'):
      print >>sys.stderr, '\nERROR: (%s: %s) %s\n' % (
        error.args[0], os.strerror(error.args[0]), error.args[1])

      if parser:
        print >>sys.stderr, parser.get_usage()

    exit_code = error.args[0]

  return exit_code


if __name__ == '__main__':
  sys.exit(_main(sys.argv))