parsing_data/slides.org
author Puneeth Chaganti <punchagan@fossee.in>
Wed, 20 Oct 2010 16:19:55 +0530
changeset 341 9f7eb1ed0e08
parent 280 40b6a90f41b7
permissions -rw-r--r--
Merged heads.
40b6a90f41b7 Slides for parsing data LO.
Puneeth Chaganti <punchagan@fossee.in>
#+LaTeX_CLASS: beamer
#+LaTeX_CLASS_OPTIONS: [presentation]
#+BEAMER_FRAME_LEVEL: 1
#+BEAMER_HEADER_EXTRA: \usetheme{Warsaw}\usecolortheme{default}\useoutertheme{infolines}\setbeamercovered{transparent}
#+COLUMNS: %45ITEM %10BEAMER_env(Env) %10BEAMER_envargs(Env Args) %4BEAMER_col(Col) %8BEAMER_extra(Extra)
#+PROPERTY: BEAMER_col_ALL 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 :ETC
#+LaTeX_HEADER: \usepackage[english]{babel} \usepackage{ae,aecompl}
#+LaTeX_HEADER: \usepackage{mathpazo,courier,euler} \usepackage[scaled=.95]{helvet}
#+LaTeX_HEADER: \usepackage{listings}
#+LaTeX_HEADER:\lstset{language=Python, basicstyle=\ttfamily\bfseries,
#+LaTeX_HEADER:  commentstyle=\color{red}\itshape, stringstyle=\color{darkgreen},
#+LaTeX_HEADER:  showstringspaces=false, keywordstyle=\color{blue}\bfseries}
#+TITLE:    Parsing Data
#+AUTHOR:    FOSSEE
#+EMAIL:     
#+DATE:    
#+DESCRIPTION: 
#+KEYWORDS: 
#+LANGUAGE:  en
#+OPTIONS:   H:3 num:nil toc:nil \n:nil @:t ::t |:t ^:t -:t f:t *:t <:t
#+OPTIONS:   TeX:t LaTeX:nil skip:nil d:nil todo:nil pri:nil tags:not-in-toc
* Outline
  - What is meant by parsing data? 
  - String operations required for parsing
  - Converting between data types
* Question 1
  Split the variable =line= using a space as the argument. Is the
  result the same as splitting without an argument?
* Solution 1
  When we split on a single space, consecutive spaces are not treated
  as one delimiter: the result contains an empty string every time
  two spaces occur in a row.
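  The difference can be seen in a small sketch. The sample value of
  =line= and the names =on_space= and =on_whitespace= are assumptions
  made for illustration, not the data used in the tutorial.
  #+begin_src python
    # Hypothetical sample line with runs of spaces between the words
    line = "parse   data  now"

    # Splitting on a single space yields an empty string for each extra space
    on_space = line.split(" ")
    # Splitting without an argument treats any run of whitespace as one delimiter
    on_whitespace = line.split()

    print(on_space)
    print(on_whitespace)
  #+end_src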
* Question 2
  What happens to the whitespace inside the sentence when it is
  stripped?
* Solution 2
  #+begin_src python
    In []: a_str = "     white      space     "
    In []: a_str.strip()
  #+end_src
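  Only the leading and trailing whitespace is removed; the whitespace
  inside the string is left untouched. A minimal sketch that makes
  this visible, reusing =a_str= (the name =stripped= is introduced
  here for illustration):
  #+begin_src python
    a_str = "     white      space     "

    # strip() removes whitespace only from the two ends of the string
    stripped = a_str.strip()

    # repr() shows the quotes, so the remaining inner spaces are visible
    print(repr(stripped))
  #+end_src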
* Question 3
  What happens if you do =int("1.25")=?
* Solution 3
  It raises a =ValueError=, since a string that represents a float
  cannot be converted to an integer directly. The conversion needs an
  intermediate step of converting to float.
  #+begin_src python
    In []: dcml_str = "1.25"
    In []: flt = float(dcml_str)
    In []: flt
    In []: number = int(flt)
    In []: number
  #+end_src
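  The failure and the two-step conversion can also be put together in
  one runnable sketch. Catching the =ValueError= is an addition made
  here for illustration; note that =int()= applied to a float drops
  the fractional part rather than rounding.
  #+begin_src python
    dcml_str = "1.25"

    try:
        # A float string is not a valid integer literal, so this raises
        number = int(dcml_str)
    except ValueError as err:
        print("int() failed:", err)
        # Go through float first; int() then discards the fraction
        number = int(float(dcml_str))

    print(number)
  #+end_src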
* Summary
  + How to tokenize a string using various delimiters
  + How to get rid of extra whitespace around a string
  + How to convert from one type to another
  + How to parse input data and perform computations on it
* Thank you!
#+begin_latex
  \begin{block}{}
  \begin{center}
  This spoken tutorial has been produced by the
  \textcolor{blue}{FOSSEE} team, which is funded by the 
  \end{center}
  \begin{center}
    \textcolor{blue}{National Mission on Education through \\
      Information \& Communication Technology \\ 
      MHRD, Govt. of India}.
  \end{center}  
  \end{block}
#+end_latex