SEESenv/scripts/finalhtml.py
author amit@thunder
Wed, 10 Mar 2010 17:39:26 +0530
changeset 45 b5bff924ef69
parent 44 d0e9b52bda73
child 49 3b5f1341d6c6
permissions -rw-r--r--
Some more changes to soup are made in final html; comment.html has also been changed so the links don't appear
import glob
#import lxml
import re
import os
from BeautifulSoup import BeautifulSoup, NavigableString
import time
import sys
import xml.etree.ElementTree as ET
import xml

repo='/home/hg/repos/SEES-hacks/temp/'
#repo='/home/amit/testdocbook2/'

def sort_doubledigit(chapter_names):
    """Move the two-digit chapters (ch10..ch19) to the end of the list, renamed from 'ch1' to 'chn1'."""
    # Compile once, outside the loop; the dot before 'html' is escaped so it matches literally.
    reg_obj=re.compile(os.path.join(repo,r'ch1[0-9].*\.html'))
    extend_list=[]
    for item in chapter_names:
        if (reg_obj.match(item)):
            item=re.sub('ch1','chn1',item)
            extend_list.append(item)
    # Assumes the ch10..ch19 entries sit at the front of the sorted list
    # (a plain string sort puts them before ch2): drop them, then re-append the renamed forms.
    chapter_names=chapter_names[len(extend_list):]
    chapter_names.extend(extend_list)

    return chapter_names


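# A minimal sketch of an alternative to the 'chn1' rename in sort_doubledigit,
# assuming chapter files are named ch<number><text>.html. The helper
# chapter_sort_key and the sample file names are hypothetical and not part of
# this script; the sketch sorts on the numeric part of each name so that ch10
# lands after ch2 without renaming anything.

```python
import re

def chapter_sort_key(path):
    """Hypothetical helper: split a name into text and integer parts for sorting."""
    # re.split with a capturing group alternates text, digits, text, ...
    # so the odd positions always hold the digit runs.
    parts = re.split(r'(\d+)', path)
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

# Hypothetical chapter file names, in the order a plain string sort might give.
names = ['ch10func.html', 'ch11oop.html', 'ch1intro.html', 'ch2basic.html']
ordered = sorted(names, key=chapter_sort_key)
print(ordered)
```

# A key like this could be passed to sorted() on the glob output, but the rest
# of the script would then have to stop expecting the 'chn1' names.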
def finalchanges(file_name,html_string):
    """Some of the final changes that need to be done on the html before creating the final usable page in the hgbook project"""
#    print html_string
    replace_string="""<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"><title>Chapter 2. Basic Python</title><link rel="stylesheet" href="/review/support/styles.css" type="text/css"><meta name="generator" content="DocBook XSL Stylesheets V1.74.3"><link rel="shortcut icon" type="image/png" href="/review/support/figs/favicon.png"><script type="text/javascript" src="/review/support/jquery-min.js"></script><script type="text/javascript" src="/review/support/form.js"></script><script type="text/javascript" src="/review/support/hsbook.js"></script></head>"""
    ch_name=os.path.split(file_name)[1].split('.')[0]
    chapter_names_unsorted=glob.glob(os.path.join(repo,'ch*.html'))
    chapter_names_unsorted.sort()
    chapter_names_sorted=chapter_names_unsorted
#    print chapter_names_sorted
    chapter_names_sorted=sort_doubledigit(chapter_names_sorted)
    chapter_names=chapter_names_sorted
    previous_string='<<<'
    next_string='>>>'
    ch_name_tmp=file_name.split('.')[0]
#    html_src_folder=""

    # Handling the problem of chapter names in two digits
    try:
        current_chapter_index=chapter_names.index(file_name)
    except ValueError:
        # sort_doubledigit renamed the ch1X entries to chn1X, so look up the renamed form
        temp_file_name=re.sub('ch1','chn1',file_name)
        current_chapter_index=chapter_names.index(temp_file_name)

    current_chapter=chapter_names[current_chapter_index].split('/')[-1]
    if (current_chapter_index-1>=0):
        previous_chapter=chapter_names[current_chapter_index-1].split('/')[-1]
    else:
        # First chapter: no previous link
        previous_chapter=''
        previous_string=''
    try:
        next_chapter=chapter_names[current_chapter_index+1].split('/')[-1]
    except IndexError:
        # Last chapter: no next link
        next_string=''
        next_chapter=''

    chapter_xml=ch_name_tmp+'.xml'

    try:
        xml_handle=open(chapter_xml,'r')
        xml_file=xml_handle.read()
        xml_handle.close()
        xml_tree=ET.fromstring(xml_file)
        try:
            title_tag=xml_tree.find('title')
            current_chapter_title=title_tag.text
        except AttributeError:
            # No <title> directly under the root: use the title of the first child
            section=xml_tree[0]
            title_tag=section.find('title')
            current_chapter_title=title_tag.text

        print current_chapter_title

#        soup.html.body.insert(0,NavigableString(body_add_string))

    except Exception:
        # XML missing or unusable: fall back to the text after the chapter number in the file name
        ch_title=re.split('[0-9]+',ch_name)[1]
        title_string=ch_title
        current_chapter_title=title_string

    body_add_string="""<div><table width="100%%" summary="Navigation header"><tr><th colspan="3" align="center">Chapter %s</th></tr><tr><td width="20%%" align="left"><a accesskey="p" href="%s">%s</a></td><th width="60%%" align="center"> </th><td width="20%%" align="right"> <a accesskey="n" href="%s">%s</a></td></tr></table></div>"""%(current_chapter_title,previous_chapter,previous_string,next_chapter,next_string)

    reg_obj=re.compile('<head>.*</head>',re.DOTALL)
    # re.DOTALL is already set on reg_obj; the third argument of sub() is a count, not flags
    html_string=reg_obj.sub(replace_string, html_string)
    html_string=re.sub('><a name',' id', html_string)
    soup=BeautifulSoup(html_string.decode('ascii','ignore'))
    soup.html.head.title.string.replaceWith(current_chapter_title)
    div=soup.html.div

    try:
        del(div['title'])
        div['id'] = ch_name
    except TypeError:
        print file_name

    soup.html.body.insert(0,NavigableString(body_add_string))

    return soup

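# A small illustration of the <head> replacement step in finalchanges, using
# only the standard re module. The sample page and replacement strings are
# hypothetical. It shows that flags such as re.DOTALL belong to re.compile(),
# while the optional extra argument of a compiled pattern's sub() is a
# replacement count.

```python
import re

# DOTALL lets '.' match newlines, so a multi-line <head> is matched whole.
head_re = re.compile('<head>.*</head>', re.DOTALL)

# Hypothetical input page and replacement head.
page = '<html><head>\n<title>old</title>\n</head><body></body></html>'
new_head = '<head><title>new</title></head>'

# No flags here: they were given once, at compile time.
result = head_re.sub(new_head, page)
print(result)

# The count argument limits how many matches are replaced.
limited = re.compile('a').sub('b', 'aaaa', count=2)
print(limited)
```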
if __name__=='__main__':
    file_names=glob.glob(os.path.join(repo,'ch*.html'))
    for file_name in file_names:
        # Read the whole chapter and close it before reopening the same path for writing
        file_obj=open(file_name,'r')
        html_string=file_obj.read()
        file_obj.close()
        soup=finalchanges(file_name,html_string)
        time.sleep(1)
        file_obj=open(file_name,'w')
        print >>file_obj ,soup
        file_obj.close()
        print file_name
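
# A minimal sketch of the chapter-title lookup that finalchanges performs,
# written without exception handling for control flow. The helper name
# chapter_title, its default parameter and the sample XML strings are
# hypothetical; only xml.etree.ElementTree from the standard library is used.

```python
import xml.etree.ElementTree as ET

def chapter_title(xml_text, default=''):
    """Hypothetical helper: return the <title> directly under the root,
    else the <title> of the root's first child, else default."""
    root = ET.fromstring(xml_text)
    title = root.find('title')
    # Fall back to the first child element when the root has no direct <title>.
    if title is None and len(root):
        title = root[0].find('title')
    if title is None or title.text is None:
        return default
    return title.text

# Hypothetical DocBook-like fragments.
direct = chapter_title('<chapter><title>Basic Python</title></chapter>')
nested = chapter_title('<article><section><title>Intro</title></section></article>')
missing = chapter_title('<chapter><para>text</para></chapter>', default='untitled')
print(direct, nested, missing)
```

# Checking for None explicitly keeps a missing <title> separate from other
# failures, such as an unreadable or malformed XML file.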