\item Draw a pie chart representing the number of students who scored more than 90\% in Science per region.
\item Draw a pie chart representing the number of students who scored more than 90\% per subject (all regions combined).
+ \item Print mean, median, mode and standard deviation of math scores for all regions combined.
\frametitle{Statistical Analysis and Parsing \ldots}
Machinery Required -
- \item File reading and parsing
+ \item File reading
+ \item Parsing
\item Dictionaries
+ \item NumPy arrays
+ \item Statistical operations
\frametitle{File reading and parsing}
Understanding the structure of sslc1.txt
- \item Each line in the file, i.e each row of a file is a single record.
- \item Each record corresponds to a record of a single student
+ \item Each line in the file corresponds to one student's details
+ \item aka record
\item Each record consists of several fields separated by a ';'
\item Region Code
\item Roll Number
\item Name
- \item Marks of 5 subjects
+ \item Marks of 5 subjects: English, Hindi, Maths, Science, Social
\item Total marks
- \item Pass (P)
+ \item Pass/Fail (P/F)
\item Withdrawn (W)
- \item Fail (F)
- \frametitle{Dictionary - Building parsed data}
+ \frametitle{Dictionary: Introduction}
- \item Let the parsed data be stored in list of dictionaries.
- \item d = \{\} is an empty dictionary
+ \item lists index: 0 \ldots n
+ \item dictionaries index using strings
+d = \{ ``Hitchhiker's guide'' : 42,
+ ``Terminator'' : ``I'll be back''\}\\
+d[``Terminator''] => ``I'll be back''
+ \frametitle{Dictionary: Introduction}
+In [1]: d = {"Hitchhiker's guide" : 42,
+ "Terminator" : "I'll be back"}
+In [2]: d["Hitchhiker's guide"]
+Out[2]: 42
+In [3]: "Hitchhiker's guide" in d
+Out[3]: True
+In [4]: "Guido" in d
+Out[4]: False
- \frametitle{Dictionary - Building parsed data}
+ \frametitle{Dictionary: Introduction}
+In [5]: d.keys()
+Out[5]: ['Terminator', "Hitchhiker's
+ guide"]
+In [6]: d.values()
+Out[6]: ["I'll be back", 42]
+ \frametitle{enumerate: Iterating through list indices}
-ninety_percents = [{}, {}, {}, {}, {}]
+In [1]: names = ["Guido","Alex", "Tim"]
+In [2]: for i, name in enumerate(names):
+ ...: print i, name
+ ...:
+0 Guido
+1 Alex
+2 Tim
+ \frametitle{Dictionary: Building parsed data}
+ Let our dictionary be:
+ \begin{lstlisting}
+science = {} # is an empty dictionary
+ \end{lstlisting}
\frametitle{Dictionary - Building parsed data}
- \item Index of a dictionary is called a \emph{key}
- \item \emph{Keys} of these dictionaries are strings - region codes
+ \item \emph{Keys} of \emph{science} will be region codes
+ \item \emph{Values} of \emph{science} will be the number of students who scored more than 90\% in that region
- \frametitle{Dictionary - Building parsed data \ldots}
- \begin{itemize}
- \item Value of a \emph{key} can be any legal Python value
- \item In this problem let the value of a \emph{key} be another an integer
- \item This dictionary contains:
- \end{itemize}
-'region code': Number of students who scored more than 90\% in this region for this subject
\frametitle{Building parsed data \ldots}
-from pylab import *
+from pylab import pie, title
-ninety_percents = [{}, {}, {}, {}, {}]
+science = {}
for record in open('sslc1.txt'):
record = record.strip()
\frametitle{Building parsed data \ldots}
+ \begin{lstlisting}
+if region_code not in science:
+ science[region_code] = 0
+score_str = fields[4].strip()
+score = (int(score_str)
+         if score_str != 'AA' else 0)
+if score > 90:
+ science[region_code] += 1
+ \end{lstlisting}
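The counting steps above can be sketched end to end on a few in-memory records. This is a minimal sketch, not the slides' own script: the sample records are made up, the helper name \texttt{count\_above\_90} is hypothetical, and the field position of the science score (index 4) is taken from the listing above rather than checked against the real file.

```python
# Made-up records standing in for lines of sslc1.txt. The layout
# (region;roll;name;five marks;total;result) is assumed from the slides.
sample_records = [
    "A;001;ASHA;78;95;91;60;88;412;P",
    "A;002;RAVI;65;92;AA;70;60;287;F",
    "B;003;MEENA;90;89;99;85;93;456;P",
    "B;004;KIRAN;AA;AA;AA;AA;AA;0;W",
]


def count_above_90(records, field_index):
    """Count, per region code, the records whose score exceeds 90."""
    counts = {}
    for record in records:
        fields = record.strip().split(';')
        region_code = fields[0].strip()
        # Make sure every region appears, even with a count of zero.
        if region_code not in counts:
            counts[region_code] = 0
        score_str = fields[field_index].strip()
        # 'AA' marks an absent student and is treated as a score of 0.
        score = int(score_str) if score_str != 'AA' else 0
        if score > 90:
            counts[region_code] += 1
    return counts


# Index 4 follows the listing above; adjust it if the file differs.
science = count_above_90(sample_records, 4)
```

With the sample data, region A ends up with a count of 2 and region B with 0. Initialising the count before testing the score keeps regions with no high scorers in the dictionary.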
+ \frametitle{Pie charts}
+ \small
+ \begin{lstlisting}
+ labels=science.keys())
+title('Students scoring more than 90% '
+      'in science by region')
+ \end{lstlisting}
+ \column{5.25\textwidth}
+ \hspace*{1.1in}
+\includegraphics[height=2in, interpolate=true]{data/science}
+ \column{0.8\textwidth}
+ \frametitle{Building data for all subjects \ldots}
+ \begin{lstlisting}
+from pylab import pie
+from numpy import array, mean, median, std
+from scipy import stats
+scores = [[] for i in range(5)]
+ninety_percents = [{} for i in range(5)]
+ \end{lstlisting}
+ \frametitle{Building data for all subjects \ldots}
+ \begin{lstlisting}
+from pylab import pie
+from numpy import array, mean, median, std
+from scipy import stats
+ \end{lstlisting}
+ \begin{block}{Creating independent list items}
+ \begin{lstlisting}
+scores = [[] for i in range(5)]
+ninety_percents = [{} for i in range(5)]
+ \end{lstlisting}
+ \end{block}
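When building a list of containers, it matters whether the items are independent objects. A minimal sketch of how list repetition with \texttt{*} differs from a list comprehension (the names \texttt{shared} and \texttt{independent} are illustrative, not from the slides):

```python
# Repetition copies the reference: all five slots are the SAME list.
shared = [[]] * 5
shared[0].append(42)

# A comprehension evaluates [] five times: five separate lists.
independent = [[] for _ in range(5)]
independent[0].append(42)

# Only the comprehension keeps the other slots untouched.
shared_lengths = [len(item) for item in shared]
independent_lengths = [len(item) for item in independent]
```

Here \texttt{shared\_lengths} is \texttt{[1, 1, 1, 1, 1]} while \texttt{independent\_lengths} is \texttt{[1, 0, 0, 0, 0]}. For per-subject score lists and per-subject dictionaries, the independent form is the one that keeps subjects apart.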
+ \frametitle{Building data for all subjects \ldots}
+ \begin{lstlisting}
+for record in open('sslc1.txt'):
+ record = record.strip()
+ fields = record.split(';')
+ region_code = fields[0].strip()
+ \end{lstlisting}
+ \frametitle{Building data for all subjects \ldots}
for i, field in enumerate(fields[3:8]):
if region_code not in ninety_percents[i]:
ninety_percents[i][region_code] = 0
score_str = field.strip()
+ score = (int(score_str)
+          if score_str != 'AA' else 0)
- score = 0 if score_str == 'AA' else
- int(score_str)
+ scores[i].append(score)
if score > 90:
ninety_percents[i][region_code] += 1
\frametitle{Pie charts}
- \small
- \begin{lstlisting}
- labels=ninety_percents[1].keys())
-title('Students scoring 90% and above
- in science by region')
- \end{lstlisting}
- \column{5.25\textwidth}
- \hspace*{1.1in}
-\includegraphics[height=2in, interpolate=true]{data/science}
- \column{0.8\textwidth}
- \frametitle{Pie charts}
pie(subj_total, labels=['English',
@@ -299,6 +397,32 @@
\includegraphics[height=3in, interpolate=true]{data/all_regions}
+ \frametitle{Obtaining statistics}
+ \begin{lstlisting}
+math_scores = array(scores[2])
+print "Mean: ", mean(math_scores)
+print "Median: ", median(math_scores)
+print "Mode: ", stats.mode(math_scores)
+print "Standard Deviation: ", \
+      std(math_scores)
+ \end{lstlisting}
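To see what these four numbers mean, they can also be computed by hand. The sketch below uses plain Python instead of the library calls on the slide; the function name \texttt{summarize} and the sample scores are made up for illustration. The standard deviation is the population form (divide by $n$), which is also NumPy's default for \texttt{std}.

```python
import math
from collections import Counter


def summarize(values):
    """Return (mean, median, mode, population standard deviation)."""
    ordered = sorted(values)
    n = len(ordered)
    mean = sum(ordered) / float(n)
    mid = n // 2
    if n % 2:
        median = ordered[mid]
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2.0
    # most_common(1) gives the value with the highest count.
    mode = Counter(ordered).most_common(1)[0][0]
    variance = sum((v - mean) ** 2 for v in ordered) / float(n)
    return mean, median, mode, math.sqrt(variance)


# Made-up scores, not taken from sslc1.txt.
math_scores = [35, 60, 60, 75, 95]
mean, median, mode, std_dev = summarize(math_scores)
```

For these five scores the mean is 65.0, the median and the mode are both 60, and the standard deviation is $\sqrt{390} \approx 19.75$.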
+ \frametitle{What tools did we use?}
+ \begin{itemize}
+ \item Dictionaries for storing data
+ \item Facilities for drawing pie charts
+ \item NumPy arrays for efficient array manipulations
+ \item Functions for statistical computations - mean, median, mode, standard deviation
+ \end{itemize}
\frametitle{L vs $T^2$ \ldots}
Let's go back to the L vs $T^2$ plot
- \frametitle{What did we learn?}
- \begin{itemize}
- \item Dictionaries
- \item Drawing pie charts
- \item Arrays
- \item Least Square fitting
- \item Intro to Matrices
- \end{itemize}
This Python idiom works for all types of variables.\\
They need not be of the same type!
- \inctime{}
+ \inctime{10}
\section{Control flow}
+\begin{frame}{Problem 1.1}
+ The aliquot of a number is the sum of its \emph{proper} divisors. For example, aliquot(12) = 1 + 2 + 3 + 4 + 6 = 16.\\
+ Write a function that returns the aliquot of a given number.
+\begin{frame}{Problem 1.2}
+ A pair of numbers (a, b) is said to be \alert{amicable} if the aliquot number of a is b and the aliquot number of b is a.\\
+ Example: \texttt{220, 284}\\
+ Write a program that prints all four-digit amicable pairs.
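One possible approach to Problems 1.1 and 1.2, offered as a sketch, not as the intended solution. The function names are illustrative. Divisors are collected in pairs $(d, n/d)$, so the loop only runs up to $\sqrt{n}$.

```python
def aliquot(n):
    """Sum of the proper divisors of n (divisors smaller than n)."""
    if n < 2:
        return 0
    total = 1  # 1 divides every n >= 2
    d = 2
    while d * d <= n:
        if n % d == 0:
            total += d
            # Add the paired divisor unless d is the square root.
            if d != n // d:
                total += n // d
        d += 1
    return total


def amicable_pairs(low, high):
    """All pairs (a, b) with low <= a < b <= high that are amicable."""
    pairs = []
    for a in range(low, high + 1):
        b = aliquot(a)
        # a < b reports each pair once and skips perfect numbers.
        if a < b <= high and aliquot(b) == a:
            pairs.append((a, b))
    return pairs


four_digit_pairs = amicable_pairs(1000, 9999)
```

As a check, \texttt{aliquot(12)} gives 16 and \texttt{amicable\_pairs(1, 300)} gives the single pair (220, 284) from the example.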
+%% \begin{frame}{Problem 2}
+%% Given an empty chessboard and one Bishop placed in any square, say (r, c), generate the list of all squares the Bishop could move to.
+%% \end{frame}
+ \frametitle{Problem Set 2}
+ Given a string like, ``1, 3-7, 12, 15, 18-21'', produce the list \\
+ \begin{lstlisting}
+ [1,3,4,5,6,7,12,15,18,19,20,21]
+ \end{lstlisting}
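A minimal sketch of one way to approach this problem: split on commas, then expand any piece containing a hyphen. The function name \texttt{expand\_ranges} is illustrative, and the sketch assumes non-negative integers and well-formed input.

```python
def expand_ranges(text):
    """Expand a string such as '1, 3-7, 12' into a list of integers."""
    numbers = []
    for part in text.split(','):
        part = part.strip()
        if '-' in part:
            # 'start-end' stands for every integer from start to end.
            start, end = part.split('-')
            numbers.extend(range(int(start), int(end) + 1))
        else:
            numbers.append(int(part))
    return numbers


result = expand_ranges("1, 3-7, 12, 15, 18-21")
```

Applied to the string above, this produces exactly the list shown in the problem statement.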
+ \frametitle{Problem Set 3}
+ \begin{description}
+ \item[3.1] Count word frequencies in a file.
+ \frametitle{Problem set 4}
+ Finite difference
+ \begin{equation*}
+ \frac{\sin(x+h)-\sin(x)}{h}
+ \end{equation*}
+ \begin{lstlisting}
+ >>> x = linspace(0,2*pi,100)
+ >>> y = sin(x)
+ >>> deltax = x[1] - x[0]
+ \end{lstlisting}
+ \pause
+ \begin{enumerate}
+ \item Given this, compute the finite difference of $\sin$ over the range $0$ to $2\pi$
+ \end{enumerate}
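A sketch of the forward difference in plain Python, so it runs without NumPy. The function name is illustrative. With the arrays from the listing above, the equivalent expression would be \texttt{(y[1:] - y[:-1]) / deltax}.

```python
import math


def forward_difference(f, xs):
    """Approximate f'(x) at xs[:-1] by (f(x + h) - f(x)) / h."""
    ys = [f(x) for x in xs]
    return [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
            for i in range(len(xs) - 1)]


# The same 100 evenly spaced points as linspace(0, 2*pi, 100).
n = 100
xs = [2 * math.pi * i / (n - 1) for i in range(n)]
dsin = forward_difference(math.sin, xs)

# Compare against the exact derivative, cos(x).
max_error = max(abs(d - math.cos(x)) for d, x in zip(dsin, xs))
```

The result has one element fewer than \texttt{xs}, since the last point has no right-hand neighbour. The error against $\cos(x)$ is bounded by roughly $h/2$, so it shrinks as more points are used.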
+ \frametitle{Problem Set 5}
+ \begin{itemize}
+ \item[5.1] Write a function that plots any regular n-gon given \typ{n}.
+ \item[5.2] Consider the logistic map, $f(x) = kx(1-x)$, plot it for
+ $k=2.5, 3.5$ and $4$ in the same plot.
+\frametitle{Problem Set 5}
+ \begin{columns}
+ \column{0.6\textwidth}
+ \small{
+ \begin{itemize}
+ \item[5.3] Consider the iteration $x_{n+1} = f(x_n)$ where $f(x) = kx(1-x)$. Plot the successive iterates of this process as explained below.
+ \end{itemize}}
+ \column{0.35\textwidth}
+ \hspace*{-0.5in}
+ \includegraphics[height=1.6in, interpolate=true]{data/cobweb}
+ \frametitle{Problem Set 5.3}
+ Plot the cobweb plot as follows:
+ \begin{enumerate}
+ \item Start at $(x_0, 0)$ and set $i = 0$
+ \item Draw a line to $(x_i, f(x_i))$
+ \item Set $x_{i+1} = f(x_i)$
+ \item Draw a line to $(x_{i+1}, x_{i+1})$
+ \item Increment $i$ ($i \to i+1$)
+ \item Repeat from step 2 for as long as you want
+ \end{enumerate}
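The steps above can be sketched as a function that returns the corner points of the cobweb path, leaving the drawing itself to a plotting call. The function names and the starting value are illustrative assumptions.

```python
def logistic(x, k):
    """The logistic map f(x) = k x (1 - x)."""
    return k * x * (1 - x)


def cobweb_points(x0, k, steps):
    """Corner points of the cobweb path, starting at (x0, 0)."""
    points = [(x0, 0.0)]
    x = x0
    for _ in range(steps):
        fx = logistic(x, k)
        points.append((x, fx))    # vertical move to the curve
        points.append((fx, fx))   # horizontal move to the line y = x
        x = fx                    # x_{i+1} = f(x_i)
    return points


# Two iterations for k = 2.5, starting from x0 = 0.2.
path = cobweb_points(0.2, 2.5, 2)
xs = [p[0] for p in path]
ys = [p[1] for p in path]
```

Passing \texttt{xs} and \texttt{ys} to pylab's \texttt{plot} joins the points with straight lines, which gives the staircase. Each iteration adds two points, so \texttt{steps} iterations yield $2 \cdot \texttt{steps} + 1$ points.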