author | Puneeth Chaganti <punchagan@fossee.in> |
Wed, 20 Oct 2010 16:19:55 +0530 | |
changeset 341 | 9f7eb1ed0e08 |
parent 280 | 40b6a90f41b7 |
permissions | -rw-r--r-- |
280
40b6a90f41b7
Slides for parsing data LO.
Puneeth Chaganti <punchagan@fossee.in>
#+LaTeX_CLASS: beamer
#+LaTeX_CLASS_OPTIONS: [presentation]
#+BEAMER_FRAME_LEVEL: 1

#+BEAMER_HEADER_EXTRA: \usetheme{Warsaw}\usecolortheme{default}\useoutertheme{infolines}\setbeamercovered{transparent}
#+COLUMNS: %45ITEM %10BEAMER_env(Env) %10BEAMER_envargs(Env Args) %4BEAMER_col(Col) %8BEAMER_extra(Extra)
#+PROPERTY: BEAMER_col_ALL 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 :ETC

#+LaTeX_HEADER: \usepackage[english]{babel} \usepackage{ae,aecompl}
#+LaTeX_HEADER: \usepackage{mathpazo,courier,euler} \usepackage[scaled=.95]{helvet}

#+LaTeX_HEADER: \usepackage{listings}

#+LaTeX_HEADER:\lstset{language=Python, basicstyle=\ttfamily\bfseries,
#+LaTeX_HEADER: commentstyle=\color{red}\itshape, stringstyle=\color{darkgreen},
#+LaTeX_HEADER: showstringspaces=false, keywordstyle=\color{blue}\bfseries}

#+TITLE: Parsing Data
#+AUTHOR: FOSSEE
#+EMAIL:
#+DATE:

#+DESCRIPTION:
#+KEYWORDS:
#+LANGUAGE: en
#+OPTIONS: H:3 num:nil toc:nil \n:nil @:t ::t |:t ^:t -:t f:t *:t <:t
#+OPTIONS: TeX:t LaTeX:nil skip:nil d:nil todo:nil pri:nil tags:not-in-toc

* Outline
- What is meant by parsing data?
- String operations required for parsing
- Converting between data types
* Question 1
Split the variable =line= using a space as the argument. Is it the same as
splitting without an argument?
* Solution 1
No. When we split on a single space, consecutive spaces are not
treated as one delimiter, so we get an empty string wherever two
spaces occur together. Splitting without an argument treats any run
of whitespace as a single delimiter.
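A minimal sketch of the difference, assuming a sample value for =line= (the string used in the tutorial itself is not shown on these slides):

#+begin_src python
# Hypothetical sample string with two consecutive spaces after "parse"
line = "parse  this string"

# Splitting on a single space: the two adjacent spaces produce an empty string
on_space = line.split(" ")
print(on_space)

# Splitting without an argument: any run of whitespace is one delimiter
on_whitespace = line.split()
print(on_whitespace)
#+end_src

The first list contains an extra empty string between the first two words; the second list contains only the three words.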
* Question 2
What happens to the white space inside the sentence when it is
stripped?
* Solution 2
Only the leading and trailing white space is removed; the white space
inside the string is left untouched.
#+begin_src python
In []: a_str = " white space "
In []: a_str.strip()
#+end_src
* Question 3
What happens if you do =int("1.25")=?
* Solution 3
It raises a =ValueError=, since a string that represents a float
cannot be converted to an integer directly. Convert the string to a
float first, and then convert the float to an integer.
#+begin_src python
In []: dcml_str = "1.25"
In []: flt = float(dcml_str)
In []: flt
In []: number = int(flt)
In []: number
#+end_src
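The error can be observed with a short sketch like the one below. The variable names follow the solution above; catching the exception is only for illustration and is not part of the original slides.

#+begin_src python
dcml_str = "1.25"

try:
    # Direct conversion of a float-like string fails
    number = int(dcml_str)
except ValueError as err:
    print("ValueError:", err)
    # Go through float first; int() then drops the fractional part
    number = int(float(dcml_str))

print(number)
#+end_src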
* Summary
+ How to tokenize a string using various delimiters
+ How to get rid of extra white space around a string
+ How to convert from one type to another
+ How to parse input data and perform computations on it
* Thank you!
#+begin_latex
\begin{block}{}
\begin{center}
This spoken tutorial has been produced by the
\textcolor{blue}{FOSSEE} team, which is funded by the
\end{center}
\begin{center}
\textcolor{blue}{National Mission on Education through \\
Information \& Communication Technology \\
MHRD, Govt. of India}.
\end{center}
\end{block}
#+end_latex