1 Module Objectives: |
|
2 ================== |
|
3 |
|
4 After successfully completing this module a participant will be able to: :: |
|
5 |
|
6 - Understand |
|
7 * What are archives and zipped files U |
|
8 * What are environment variables U |
|
9 * What are Shell Scripts U |
|
10 - Able to use file comparison commands like Ap |
|
11 diff, cmp, comm |
|
12 - Create and extract archives(.tar files) and zipped files(.gz) Ap |
|
13 - Set/Modify environment as per need Ap |
|
14 - Create shell scripts to automate tasks. Ap |
|
15 |
|
16 tar: |
1 tar: |
17 ==== |
2 ==== |
18 |
3 |
19 Introduction: |
4 Introduction: |
20 ------------- |
5 ------------- |
21 |
6 |
22 In the world of Linux distributions, the term *tarball* comes up very often. The *tar* program is part of the GNU project and ships with every GNU/Linux distribution. Tarballs are the de facto standard for releasing the source code of free software. Common uses of *tar* archives are to *store, back up, and transport* files. |
7 In the world of Linux distributions, the term *tarball* comes up very often. The *tar* program is part of the GNU project and ships with every GNU/Linux distribution. Tarballs are the de facto standard for releasing the source code of free software. Common uses of *tar* archives are to *store, back up, and transport* files. |
23 |
8 |
24 GNU tar creates and manipulates archives, which are collections of many other files; the program gives users an organized and systematic way to handle a large amount of data. An archive is basically formed by concatenating one or more files. |
9 GNU tar creates and manipulates archives, which are collections of many other files; the program gives users an organized and systematic way to handle a large amount of data. An archive is basically formed by concatenating one or more files. |
25 |
10 |
26 Getting Started(go go go!): |
11 Getting Started: |
27 --------------------------- |
12 --------------------------- |
28 |
13 |
29 As mentioned previously, *the best way to get started with any Linux command-line tool is to use "man".* :: |
14 As mentioned previously, *the best way to get started with any Linux command-line tool is to use "man".* :: |
30 |
15 |
31 $ man tar |
16 $ man tar |
149 |
134 |
150 $ tar -xvf allfiles.tar second.txt |
135 $ tar -xvf allfiles.tar second.txt |
151 second.txt |
136 second.txt |
152 |
137 |
153 |
138 |
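The create, list and extract steps discussed in this section can be tried safely in a scratch directory. The following is only a minimal sketch: the directory, the file names and their contents are made up for illustration, and it assumes a tar that supports the ``-C`` option (GNU tar and bsdtar both do).

```shell
#!/bin/sh
# Sketch: create, list and partially extract a tar archive in a scratch directory.
set -e
workdir=$(mktemp -d)              # hypothetical scratch location
cd "$workdir"
printf 'one\n' > first.txt        # sample files, invented for this example
printf 'two\n' > second.txt

tar -cf allfiles.tar first.txt second.txt    # c = create, f = archive file name
listing=$(tar -tf allfiles.tar)              # t = list members without extracting
echo "$listing"

mkdir extracted
tar -xf allfiles.tar -C extracted second.txt # x = extract only the named member
extracted_text=$(cat extracted/second.txt)
echo "$extracted_text"
```

Only ``second.txt`` appears under ``extracted/``; the archive itself is left untouched.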
154 Further Reading for this section: |
|
155 --------------------------------- |
|
156 |
|
157 * http://en.wikipedia.org/wiki/Tar_(file_format) |
|
158 * http://www.gnu.org/software/tar/manual/tar.html |
|
159 * http://linuxreviews.org/beginner/ |
|
160 |
139 |
161 GZip: |
140 GZip: |
162 ===== |
141 ===== |
163 |
142 |
164 Tar creates archives, but it does not compress data unless told to explicitly. Hence an archive created with the tar command is roughly the total size of the individual files. Linux provides a compression tool called *gzip* to reduce the size of files. Whenever possible, each file is replaced by one with the extension `.gz`, so unlike `tar`, this command *replaces the existing file*. |
143 Tar creates archives, but it does not compress data unless told to explicitly. Hence an archive created with the tar command is roughly the total size of the individual files. Linux provides a compression tool called *gzip* to reduce the size of files. Whenever possible, each file is replaced by one with the extension `.gz`, so unlike `tar`, this command *replaces the existing file*. |
194 first.txt: 4.4% -- replaced with first.txt.gz |
173 first.txt: 4.4% -- replaced with first.txt.gz |
195 fourth.txt: -7.1% -- replaced with fourth.txt.gz |
174 fourth.txt: -7.1% -- replaced with fourth.txt.gz |
196 second.txt: -4.8% -- replaced with second.txt.gz |
175 second.txt: -4.8% -- replaced with second.txt.gz |
197 third.txt: 3.8% -- replaced with third.txt.gz |
176 third.txt: 3.8% -- replaced with third.txt.gz |
198 |
177 |
199 For very small files and in some other cases, one might end up with a zipped file that is larger than the original, but compression is always performed (so don't be disheartened by the larger files in the case above :P). Unlike tar, gzip compresses every file separately by default; to make them part of one single chunk, one can use *pipes* and *redirections* :: |
178 |
200 |
179 |
201 $ gzip -c *.txt > all.gz |
180 $ gzip -c *.txt > all.gz |
202 |
181 |
203 In this case, all the files are compressed and concatenated, and the output is written to the file all.gz, leaving all the original files in place. In the command above, the *`-c`* option writes the output to standard output (stdout), and the following *`>`* redirects it to the file all.gz. So when we decompress this file, we get a single file named 'all' containing the content of every file, concatenated one after another. |
182 In this case, all the files are compressed and concatenated, and the output is written to the file all.gz, leaving all the original files in place. In the command above, the *`-c`* option writes the output to standard output (stdout), and the following *`>`* redirects it to the file all.gz. So when we decompress this file, we get a single file named 'all' containing the content of every file, concatenated one after another. |
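The concatenation behaviour of ``gzip -c`` can be checked with two tiny files. This is a sketch under assumed names (``a.txt``, ``b.txt`` and the scratch directory are invented here); it decompresses to standard output with ``-dc`` so nothing is replaced on disk.

```shell
#!/bin/sh
# Sketch: compress two files into one .gz stream and read it back.
set -e
workdir=$(mktemp -d)          # hypothetical scratch location
cd "$workdir"
printf 'alpha\n' > a.txt      # sample content, invented for this example
printf 'beta\n'  > b.txt

gzip -c a.txt b.txt > all.gz  # -c writes to stdout, so the originals are kept
restored=$(gzip -dc all.gz)   # -d decompresses, -c prints instead of replacing
echo "$restored"
```

The restored text is the content of both files back to back, with no record of where one file ended and the next began, which is why tar is used first when the individual files need to be recovered.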
204 |
183 |
210 ./second.txt: -4.8% -- replaced with ./second.txt.gz |
189 ./second.txt: -4.8% -- replaced with ./second.txt.gz |
211 ./third.txt: 3.8% -- replaced with ./third.txt.gz |
190 ./third.txt: 3.8% -- replaced with ./third.txt.gz |
212 ./allfiles.tar: 96.6% -- replaced with ./allfiles.tar.gz |
191 ./allfiles.tar: 96.6% -- replaced with ./allfiles.tar.gz |
213 ./fourth.txt: -7.1% -- replaced with ./fourth.txt.gz |
192 ./fourth.txt: -7.1% -- replaced with ./fourth.txt.gz |
214 |
193 |
215 Hence one often sees files like xxxxx.tar.gz. To create a zip of a whole directory in a single file, first archive everything inside the folder and then use gzip on the archive. To compress the files using tar itself, use the option *`z`*. :: |
194 Hence one often sees files like something.tar.gz. To create a zip of a whole directory in a single file, first archive everything inside the folder and then use gzip on the archive. To compress the files using tar itself, use the option *`z`*. :: |
|
195 |
|
196 |
216 |
197 |
217 $ tar -cvzf zipped.tar.gz *.txt |
198 $ tar -cvzf zipped.tar.gz *.txt |
218 first.txt |
199 first.txt |
219 fourth.txt |
200 fourth.txt |
220 second.txt |
201 second.txt |
221 third.txt |
202 third.txt |
222 |
203 |
223 *That's why gzip is designed as a complement to tar, not as a replacement.* |
204 *That's why gzip is designed as a complement to tar, not as a replacement.* |
224 |
205 |
225 The gzip command comes with an option *`-l`* to view the contents of a compressed file: :: |
206 The gzip command comes with an option *`-l`* to view the contents of a compressed file: :: |
226 |
207 |
227 $ gzip -l zipped.tar.gz |
208 $ gzip -l zipped.tar.gz |
228 compressed uncompressed ratio uncompressed_name |
209 compressed uncompressed ratio uncompressed_name |
229 332 10240 97.0% zipped.tar |
210 332 10240 97.0% zipped.tar |
230 |
211 |
231 Another feature of gzip is the option to choose the compression level. The option *`-n`*, where *n varies from 1 to 9*, regulates the trade-off between speed and compression. *`-1`* or *`--fast`* selects the fastest method (least compression), and *`-9`* or *`--best`* selects the slowest method (best compression); the default level is *`-6`*. |
212 |
232 |
213 |
233 To decompress an already compressed file there are two options: either use the *`gunzip`* command or use the *`-d`* option with the gzip command: :: |
214 To decompress an already compressed file there are two options: either use the *`gunzip`* command or use the *`-d`* option with the gzip command: :: |
234 |
215 |
235 $ gzip -dv *.gz |
216 $ gzip -dv *.gz |
236 all.gz: -440.4% -- replaced with all |
217 all.gz: -440.4% -- replaced with all |
261 Linux distributions also provide utilities for checking the content of files and quickly comparing them with other files, looking for differences or similarities. Some of the commands that prove handy are: |
242 Linux distributions also provide utilities for checking the content of files and quickly comparing them with other files, looking for differences or similarities. Some of the commands that prove handy are: |
262 |
243 |
263 cmp: |
244 cmp: |
264 ---- |
245 ---- |
265 |
246 |
266 To check whether two files are the same, one can use this handy tool. Consider this situation: we run the find/locate command to locate a file, and it turns out that files with the same name exist in different locations. For a quick check of their content, cmp is the right tool. On my system I perform these tasks to illustrate the use of this command: :: |
247 To check whether two files are the same, one can use this handy tool. Consider this situation: we run the find/locate command to locate a file, and it turns out that files with the same name exist in different locations. For a quick check of their content, cmp is the right tool. Usage :: |
267 |
|
268 $ find . -name quick.c |
|
269 ./Desktop/programs/quick.c |
|
270 ./c-folder/quick.c |
|
271 $ cmp Desktop/programs/quick.c c-folder/quick.c |
|
272 $ |
|
273 |
|
274 For me it returns nothing, which means both files are exact copies of each other; by default, cmp is silent if the files are the same. Make some changes in one of the files and rerun the command. For me it works like this: :: |
|
275 |
248 |
276 $ cmp Desktop/programs/quick.c c-folder/quick.c |
249 $ cmp Desktop/programs/quick.c c-folder/quick.c |
277 Desktop/programs/quick.c c-folder/quick.c differ: byte 339, line 24 |
250 Desktop/programs/quick.c c-folder/quick.c differ: byte 339, line 24 |
278 |
251 |
279 That is, if files differ, the byte and line number at which the first difference occurred is reported. |
252 That is, if files differ, the byte and line number at which the first difference occurred is reported. |
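Because cmp reports its result through the exit status as well, it is convenient inside scripts. A minimal sketch, assuming three throwaway files created only for this example; the ``-s`` option keeps cmp silent so that only the exit status is used.

```shell
#!/bin/sh
# Sketch: use the exit status of cmp to decide whether two files are identical.
workdir=$(mktemp -d)            # hypothetical scratch location
cd "$workdir" || exit 1
printf 'same\n'  > one.txt      # sample files, invented for this example
cp one.txt two.txt
printf 'other\n' > three.txt

# cmp exits with 0 when the files are identical and 1 when they differ.
if cmp -s one.txt two.txt; then
    copy_check="identical"
else
    copy_check="different"
fi

if cmp -s one.txt three.txt; then
    other_check="identical"
else
    other_check="different"
fi

echo "one.txt vs two.txt:   $copy_check"
echo "one.txt vs three.txt: $other_check"
```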
281 diff: |
254 diff: |
282 ----- |
255 ----- |
283 |
256 |
284 There are situations when one wants to know exactly how two files differ; for this, GNU diff reports the differences line by line. For simple and basic usage of this program, consider the following example: :: |
257 There are situations when one wants to know exactly how two files differ; for this, GNU diff reports the differences line by line. For simple and basic usage of this program, consider the following example: :: |
285 |
258 |
286 $ echo -e "quick\nbrown\nfox\njumped\nover\nthe\nlazy\ndog" > allcharacters.txt |
259 $ echo -e "quick\nbrown\nfox\njumped\nover\nthe\nlazy\ndog" > allcorrect.txt |
287 $ echo -e "quick\nbrown\nfox\njmuped\nover\nteh\nlzay\ndog" > problem.txt |
260 $ echo -e "quick\nbrown\nfox\njmuped\nover\nteh\nlzay\ndog" > incorrect.txt |
288 $ diff problem.txt allcharacters.txt |
261 $ diff incorrect.txt allcorrect.txt |
289 4c4 |
262 4c4 |
290 < jmuped |
263 < jmuped |
291 --- |
264 --- |
292 > jumped |
265 > jumped |
293 6,7c6,7 |
266 6,7c6,7 |
297 > the |
270 > the |
298 > lazy |
271 > lazy |
299 |
272 |
300 From the results above it is easy to see that diff, when used on two text files, reports line by line every line that differs. A common use case: there are files in various locations on the system with the same name and size; run diff on them and remove the redundant copies. A similar command that can be easier to read is *sdiff*; for the same files, sdiff gives: :: |
273 From the results above it is easy to see that diff, when used on two text files, reports line by line every line that differs. A common use case: there are files in various locations on the system with the same name and size; run diff on them and remove the redundant copies. A similar command that can be easier to read is *sdiff*; for the same files, sdiff gives: :: |
301 |
274 |
302 $ sdiff problem.txt allcharacters.txt |
275 $ sdiff incorrect.txt allcorrect.txt |
303 quick quick |
276 quick quick |
304 brown brown |
277 brown brown |
305 fox fox |
278 fox fox |
306 jmuped | jumped |
279 jmuped | jumped |
307 over over |
280 over over |
308 teh | the |
281 teh | the |
309 lzay | lazy |
282 lzay | lazy |
310 dog dog |
283 dog dog |
311 |
284 |
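When only a yes/no answer is needed, diff can be asked to stay brief. The sketch below assumes a diff that supports the ``-q`` option (GNU and BSD diff do; it is not guaranteed everywhere), and the two sample files are invented for illustration.

```shell
#!/bin/sh
# Sketch: report whether two files differ, then count the differing lines.
workdir=$(mktemp -d)                       # hypothetical scratch location
cd "$workdir" || exit 1
printf 'quick\nbrown\nfox\n' > good.txt    # sample files, invented for this example
printf 'quick\nbronw\nfox\n' > bad.txt

# diff exits with 0 when the files match and 1 when they differ.
status=0
diff -q bad.txt good.txt || status=$?
echo "exit status: $status"

# Lines starting with < or > are the differing lines from each file.
changed=$(diff bad.txt good.txt | grep -c '^[<>]')
echo "differing lines reported: $changed"
```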
312 Some exercise for a change: |
|
313 |
|
314 * Try using diff for any binary file, does it work? |
|
315 * What are other equivalent for diff command based on needs/requirements? |
|
316 * Can we use diff to compare two directories? If yes how? |
|
317 |
285 |
318 comm: |
286 comm: |
319 ----- |
287 ----- |
320 |
288 |
321 This is one more command that proves handy at times; the short and sweet man page states "compare two sorted files line by line". That is, it compares sorted files and selects or rejects the lines common to both files. For example: :: |
289 This is one more command that proves handy at times; the short and sweet man page states "compare two sorted files line by line". That is, it compares sorted files and selects or rejects the lines common to both files. For example: :: |
322 |
290 |
323 $ sort allcharacters.txt>sortedcharac.txt; sort problem.txt>sortedprob.txt |
291 $ sort allcorrect.txt>sortedcharac.txt; sort incorrect.txt>sortedprob.txt |
324 $ comm sortedcharac.txt sortedprob.txt |
292 $ comm sortedcharac.txt sortedprob.txt |
325 brown |
293 brown |
326 dog |
294 dog |
327 fox |
295 fox |
328 jmuped |
296 jmuped |
337 Environment Variables: |
305 Environment Variables: |
338 ====================== |
306 ====================== |
339 |
307 |
340 Environment variables such as HOME and OSTYPE are a way of passing information from the shell to programs when you run them. Programs look "in the environment" for particular variables and, if they are found, use the values stored. Standard UNIX variables are split into two categories, environment variables and shell variables. In broad terms, shell variables apply only to the current instance of the shell and are used to set short-term working conditions; environment variables have a farther-reaching significance, and those set at login are valid for the duration of the session. By convention, environment variables have UPPER CASE names and shell variables have lower case names. |
308 Environment variables such as HOME and OSTYPE are a way of passing information from the shell to programs when you run them. Programs look "in the environment" for particular variables and, if they are found, use the values stored. Standard UNIX variables are split into two categories, environment variables and shell variables. In broad terms, shell variables apply only to the current instance of the shell and are used to set short-term working conditions; environment variables have a farther-reaching significance, and those set at login are valid for the duration of the session. By convention, environment variables have UPPER CASE names and shell variables have lower case names. |
341 |
309 |
342 Some examples of environment variables are (results may vary!): :: |
310 Some examples of environment variables are: :: |
343 |
311 |
344 $ echo $OSTYPE |
312 $ echo $OSTYPE |
345 linux-gnu |
313 linux-gnu |
346 $ echo $HOME |
314 $ echo $HOME |
347 /home/baali |
315 /home/user |
348 |
316 |
349 To see all the variables and their values, use either of the following commands: :: |
317 To see all the variables and their values, use either of the following commands: :: |
350 |
318 |
351 $ printenv | less |
319 $ printenv | less |
352 $ env |
320 $ env |
360 $ repo=$HOME/Desktop/random/code |
328 $ repo=$HOME/Desktop/random/code |
361 $ cd $repo |
329 $ cd $repo |
362 |
330 |
363 A plain assignment such as *repo=...* defines a variable for the current shell only (in bash, *set repo = ...* would instead overwrite the positional parameters and leave *$repo* empty). Try opening a new shell and running *cd $repo* there; it won't work as expected. Child processes cannot see the variable unless we *export* it. Repeat the activity above with the *export* command; now shells started from this one will see *$repo*. |
331 A plain assignment such as *repo=...* defines a variable for the current shell only (in bash, *set repo = ...* would instead overwrite the positional parameters and leave *$repo* empty). Try opening a new shell and running *cd $repo* there; it won't work as expected. Child processes cannot see the variable unless we *export* it. Repeat the activity above with the *export* command; now shells started from this one will see *$repo*. |
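The difference between a shell variable and an exported one can be observed by asking a child shell what it sees. This is a minimal sketch for a POSIX shell such as bash; the variable name and the path are only illustrative, and the path does not need to exist.

```shell
#!/bin/sh
# Sketch: a child process sees a variable only after it has been exported.
unset repo                               # start from a clean state
repo="$HOME/Desktop/random/code"         # shell variable, current shell only

# The single quotes make the child shell, not this one, expand the variable.
in_child_before=$(sh -c 'echo "${repo:-unset}"')

export repo                              # now part of the environment
in_child_after=$(sh -c 'echo "${repo:-unset}"')

echo "before export: $in_child_before"
echo "after export:  $in_child_after"
```

The child shell reports ``unset`` before the export and the full path afterwards.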
364 |
332 |
365 Again these changes are limited to current session. To make them permanent or get loaded each time you log in, just add those lines to *.bashrc* file. |
333 |
366 |
334 |
367 Further Reading: |
|
368 ---------------- |
|
369 |
|
370 * http://lowfatlinux.com/linux-environment-variables.html |
|
371 * http://www.codecoffee.com/tipsforlinux/articles/030.html |
|
372 * http://www.ee.surrey.ac.uk/Teaching/Unix/unix8.html |
|
373 * http://en.wikipedia.org/wiki/Environment_variable |
|
374 |
335 |
375 |
336 |
376 Shell Scripting: |
337 Shell Scripting: |
377 ================ |
338 ================ |
378 |
339 |
380 ------- |
341 ------- |
381 |
342 |
382 A shell program, or shell script, is a sequence of commands written to a text file; we then tell the shell to execute the file instead of entering the commands one by one. The first *"Hello World"* sample for shell scripting is as easy as it sounds: :: |
343 A shell program, or shell script, is a sequence of commands written to a text file; we then tell the shell to execute the file instead of entering the commands one by one. The first *"Hello World"* sample for shell scripting is as easy as it sounds: :: |
383 |
344 |
384 $ echo '#!/bin/sh' > my-script.sh |
345 $ echo '#!/bin/sh' > my-script.sh |
385 $ clear >> my-script.sh |
346 $ echo 'clear' >> my-script.sh |
386 $ echo 'echo Hello World' >> my-script.sh |
347 $ echo 'echo Hello World' >> my-script.sh |
387 $ chmod 755 my-script.sh |
348 $ chmod 755 my-script.sh |
388 $ ./my-script.sh |
349 $ ./my-script.sh |
389 Hello World |
350 Hello World |
390 |
351 |
406 echo $name |
367 echo $name |
407 last_modified=`stat -c %y $name| cut -f 1 -d " "` |
368 last_modified=`stat -c %y $name| cut -f 1 -d " "` |
408 echo "Last modified: $last_modified" |
369 echo "Last modified: $last_modified" |
409 $ ./search.sh fname |
370 $ ./search.sh fname |
410 |
371 |
411 Try giving the name of a file you want to search for in place of fname. Please note that in the second line *`* is a back-quote (the key shared with the tilde); it is used to capture the output of a command in a variable. In this particular case name is a user-defined variable (UDV) that stores the value. We access the value stored in any variable by putting the *$* symbol before its name. |
372 Try giving the name of a file you want to search for in place of fname. Please note that in the second line *`* is a back-quote (the key shared with the tilde); it is used to capture the output of a command in a variable. In this particular case name is a user-defined variable that stores the value. We access the value stored in any variable by putting the *$* symbol before its name. |
412 |
373 |
413 naming conventions for variables?? do we need them?? |
374 |
414 |
375 |
415 Shell Arithmetic: |
376 Shell Arithmetic: |
416 ----------------- |
377 ----------------- |
417 |
378 |
418 Shell also provides support for basic arithmetic operations. The syntax is: :: |
379 Shell also provides support for basic arithmetic operations. The syntax is: :: |
481 echo "$0 : You must give one integer" |
442 echo "$0 : You must give one integer" |
482 exit 1 |
443 exit 1 |
483 fi |
444 fi |
484 fi |
445 fi |
485 |
446 |
486 One important thing to note in shell scripts is spacing: with many comparison and evaluation operations, a wrongly placed space will spoil all the fun. In the previous example the expression *[ $# -eq 0 ]* works properly, but if we remove the leading or trailing space, as in *[ $# -eq 0]*, it won't work as expected and the shell reports an error. Both *test* and *[ ]* do the same task of testing an expression and returning true or false. |
447 One important thing to note in shell scripts is spacing: with many comparison and evaluation operations, a wrongly placed space will spoil all the fun. In the previous example the expression *[ $# -eq 0 ]* works properly, but if we remove the leading or trailing space, as in *[ $# -eq 0]*, it won't work as expected and the shell reports an error. Both *test* and *[ ]* do the same task of testing an expression and returning true or false. |
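To see that the two forms are interchangeable, the sketch below uses each of them once in the same ``if`` chain. The function name ``count_args`` is invented for this example; note the space after ``[`` and before ``]``.

```shell
#!/bin/sh
# Sketch: "[ ... ]" and "test ..." evaluate the same kind of expression.
count_args() {
    if [ "$#" -eq 0 ]; then          # bracket form: spaces around [ and ] are required
        echo "no arguments"
    elif test "$#" -eq 1; then       # equivalent "test" form, no brackets
        echo "one argument"
    else
        echo "$# arguments"
    fi
}

none=$(count_args)
one=$(count_args a)
many=$(count_args a b c)
echo "$none / $one / $many"
```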
487 |
448 |
488 Let's create something interesting using the if-else clause. We will write a script that greets the user whenever they open a shell. We create the script, change its permissions to make it executable, and append a *./greet.sh* line to the *.bashrc* file, and we are done. The script is: :: |
449 Let's create something interesting using the if-else clause. We will write a script that greets the user whenever they open a shell. We create the script, change its permissions to make it executable, and append a *./greet.sh* line to the *.bashrc* file, and we are done. The script is: :: |
489 |
450 |
490 #!/bin/sh |
451 #!/bin/sh |
491 #Script to greet the user according to time of day |
452 #Script to greet the user according to time of day |
556 do |
517 do |
557 j=$(echo "$i"|grep -o "[A-Za-z'&. ]*\.mp3") |
518 j=$(echo "$i"|grep -o "[A-Za-z'&. ]*\.mp3") |
558 echo "$i -> $j" |
519 echo "$i -> $j" |
559 done |
520 done |
560 |
521 |
561 Now we just replace the echo command with a ``mv`` or a ``cp`` command. |
522 Now we just replace the echo command with a ``mv`` command. |
562 :: |
523 :: |
563 |
524 |
564 for i in *.mp3 |
525 for i in *.mp3 |
565 do |
526 do |
566 j=$(echo "$i"|grep -o "[A-Za-z'&. ]*\.mp3") |
527 j=$(echo "$i"|grep -o "[A-Za-z'&. ]*\.mp3") |
567 cp "$i" "$j" |
528 mv "$i" "$j" |
568 done |
529 done |
569 |
530 |
570 As an exercise, you could try sorting the files in reverse alphabetical order and then prefix numbers to each of the filenames. |
531 |
571 |
532 |
572 ``while`` |
533 ``while`` |
573 ~~~~~~~~~ |
534 ~~~~~~~~~ |
574 |
535 |
575 The ``while`` command allows us to repeatedly execute a block of commands as long as the command controlling the loop exits successfully. |
536 The ``while`` command allows us to repeatedly execute a block of commands as long as the command controlling the loop exits successfully. |
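A minimal countdown illustrates the idea; the variable names are chosen only for this sketch. The loop body runs for as long as the test command succeeds and stops the first time it fails.

```shell
#!/bin/sh
# Sketch: the loop body runs while the controlling command succeeds.
count=3
output=""
while [ "$count" -gt 0 ]         # succeeds (exit status 0) while count is positive
do
    output="$output$count "      # record the current value
    count=$((count - 1))         # shell arithmetic moves us towards the exit condition
done
echo "$output"
```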
644 |
605 |
645 |
606 |
646 Functions |
607 Functions |
647 --------- |
608 --------- |
648 |
609 |
649 When a group of commands is used repeatedly within a script, it is convenient to group them as a function. This saves a lot of time, since you avoid retyping the same code again and again, and it makes the code easier to maintain. Let's see how we can define a simple function, ``hello-world``. Functions can be defined in bash either using the ``function`` keyword followed by the function name, or just the function name followed by a pair of parentheses. |
610 When a group of commands is used repeatedly within a script, it is convenient to group them as a function. This saves a lot of time, since you avoid retyping the same code again and again, and it makes the code easier to maintain. Let's see how we can define a simple function, ``hello-world``. A function is defined by writing the function name followed by a pair of parentheses. |
650 :: |
611 :: |
651 |
612 |
652 function hello-world |
613 |
653 { |
|
654 echo "Hello, World."; |
|
655 } |
|
656 |
|
657 hello-world () { |
614 hello-world () { |
658 echo "Hello, World."; |
615 echo "Hello, World."; |
659 } |
616 } |
660 |
617 |
661 $ hello-world |
618 $ hello-world |
662 Hello, World. |
619 Hello, World. |
663 |
620 |
664 Passing parameters to functions is similar to passing them to scripts. |
621 Passing parameters to functions is similar to passing them to scripts. |
665 :: |
622 :: |
666 |
623 |
667 function hello-name |
624 |
|
625 #! /bin/bash |
|
626 |
|
627 hello-name() |
|
628 { |
|
629 echo "hello $1" |
|
630 |
|
631 } |
|
632 |
|
633 hello-name $1 |
|
634 |
|
635 |
|
636 #!/bin/bash |
|
637 hello-name () |
668 { |
638 { |
669 echo "Hello, $1."; |
639 echo "Hello, $1."; |
670 } |
640 } |
671 |
641 |
672 $ hello-name 9 |
642 hello-name $1 |
|
643 |
|
644 Save this in a file helloscript.sh and give it execute permission. |
|
645 |
|
646 |
|
647 $ ./helloscript.sh 9 |
673 Hello, 9. |
648 Hello, 9. |
674 |
649 |
675 Any variable that you define within a function is added to the global namespace. To restrict a variable to the scope of the function, define it using the ``local`` built-in command of bash. |
650 Any variable that you define within a function is added to the global namespace. To restrict a variable to the scope of the function, define it using the ``local`` built-in command of bash. |
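The effect of ``local`` is easiest to see with two functions that assign to the same name. This is a sketch with invented function and variable names; ``local`` is a bash built-in (dash supports it too), so it may not work in every ``sh``.

```shell
#!/bin/bash
# Sketch: "local" keeps an assignment inside the function; without it the global changes.
counter="global"

with_local() {
    local counter="inside"       # visible only while this function runs
    echo "with_local sees: $counter"
}

without_local() {
    counter="overwritten"        # no "local": this changes the global variable
}

with_local
after_local=$counter             # the global value is untouched

without_local
after_global=$counter            # the global value has been replaced

echo "after with_local:    $after_local"
echo "after without_local: $after_global"
```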
676 |
651 |
|
652 |
677 We shall now write a function for the word frequency generating script that we had looked at in the previous session. |
653 We shall now write a function for the word frequency generating script that we had looked at in the previous session. |
678 |
654 |
679 :: |
655 :: |
680 |
656 |
681 function word_frequency { |
657 word_frequency() { |
682 if [ $# -ne 1 ] |
658 if [ $# -ne 1 ] |
683 then |
659 then |
684 echo "Usage: $0 file_name" |
660 echo "Usage: $0 file_name" |
685 exit 1 |
661 exit 1 |
686 else |
662 else |