

Practices

1. Choose Your Shell

The first thing you should do before starting a shell script, or any kind of script or program for that matter, is enumerate the requirements and the goal of that script. Then evaluate what the best tool is to accomplish those goals.

BASH may be easy to learn and write in, but it isn't always fit for the job.

There are a lot of tools in the basic toolset that can help you. If you just need AWK, then don't make a shell script that invokes AWK. Just make an AWK script. If you need to retrieve data from an HTML or XML file in a reliable manner, Bash is also the wrong tool for the job. You should consider XPath/XSLT instead, or a language that has a library available for parsing XML or HTML.

If you decide that a shell script is what you want, then first ask yourself these questions:

If the above questions do not limit your choices, use all the Bash features you require, and then make note of which version of Bash is required to run your script.

Using Bash 3 or higher means you can avoid ancient scripting techniques. They have been replaced with far better alternatives for very good reasons.
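
One way to make note of the required version is to let the script check it at startup. The following is only a minimal sketch: the function name require_bash is made up for this illustration, and the check relies on the BASH_VERSINFO array that Bash sets for you.

```bash
# require_bash is a hypothetical helper name, not something Bash provides.
# It succeeds when the running Bash is at least version MAJOR.MINOR.
require_bash() {
    local major=$1 minor=${2:-0}
    if (( BASH_VERSINFO[0] > major )); then
        return 0
    fi
    (( BASH_VERSINFO[0] == major && BASH_VERSINFO[1] >= minor ))
}

if ! require_bash 3 0; then
    echo "This script needs Bash 3.0 or newer." >&2
    exit 1
fi
```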

2. Quoting

Word Splitting is the demon inside BASH that is out to get unsuspecting newcomers or even veterans who let down their guard.

If you do not understand how Word Splitting works or when it is applied you should be very careful with your strings and your Parameter Expansions. I suggest you read up on WordSplitting if you doubt your knowledge.

The best way to protect yourself from this beast is to quote all your strings. Quotes keep your strings in one piece and prevent Word Splitting from tearing them open. Allow me to illustrate:

    $ echo Push that word             away from me.
    Push that word away from me.
    $ echo "Push that word             away from me."
    Push that word             away from me.

Now, don't think Word Splitting is about collapsing spaces. What really happened in this example is that the first command passed each word in our sentence as a separate argument to echo. BASH split our sentence up in words using the whitespace between them to determine where each argument begins and ends. In the second example BASH is forced to keep the whole quoted string together. This means it's not split up into arguments and the whole string is passed to echo as one argument. echo prints out each argument it gets with a space in between them. You should understand the basics of Word Splitting now.
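
To see the argument counts for yourself, a small helper that reports how many arguments it received can be useful. This is just a sketch; count_args is a made-up name for this illustration.

```bash
# count_args is a hypothetical helper: it prints the number of
# arguments it was called with.
count_args() {
    echo "$#"
}

count_args Push that word             away from me.      # prints 6
count_args "Push that word             away from me."    # prints 1
```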

This is where it gets dangerous: Word Splitting does not just happen on literal strings. It also happens after Parameter Expansion! As a result, on a dull and tired day, you might just be stupid enough to make this mistake:

    $ sentence="Push that word             away from me."
    $ echo $sentence
    Push that word away from me.
    $ echo "$sentence"
    Push that word             away from me.

As you can see, in the first echo command, we were negligent and left out the quotes. That was a mistake. BASH expanded our sentence and then used Word Splitting to split the resulting expansion up into arguments to use for echo. In our second example, the quotes around the Parameter Expansion of sentence make sure BASH does not split it up into multiple arguments around the whitespace.

It's not just spaces you need to protect. Word Splitting occurs on spaces, tabs and newlines, or whatever characters are in the IFS variable. Here's another example to show you just how badly you can break things if you neglect to use quotes:

    $ echo "$(ls -al)"
    total 8
    drwxr-xr-x   4 lhunath users 1 2007-06-28 13:13 "."/
    drwxr-xr-x 102 lhunath users 9 2007-06-28 13:13 ".."/
    -rw-r--r--   1 lhunath users 0 2007-06-28 13:13 "a"
    -rw-r--r--   1 lhunath users 0 2007-06-28 13:13 "b"
    -rw-r--r--   1 lhunath users 0 2007-06-28 13:13 "c"
    drwxr-xr-x   2 lhunath users 1 2007-06-28 13:13 "d"/
    drwxr-xr-x   2 lhunath users 1 2007-06-28 13:13 "e"/
    $ echo $(ls -al)
    total 8 drwxr-xr-x 4 lhunath users 1 2007-06-28 13:13 "."/ drwxr-xr-x 102 lhunath users 9 2007-06-28 13:13 ".."/ -rw-r--r-- 1 lhunath users 0 2007-06-28 13:13 "a" -rw-r--r-- 1 lhunath users 0 2007-06-28 13:13 "b" -rw-r--r-- 1 lhunath users 0 2007-06-28 13:13 "c" drwxr-xr-x 2 lhunath users 1 2007-06-28 13:13 "d"/ drwxr-xr-x 2 lhunath users 1 2007-06-28 13:13 "e"/
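
The role of IFS can be sketched in the same way. In the following illustration (the variable names and the sample data are made up), IFS is temporarily set to a colon, so the unquoted expansion is split on colons only and the space inside the last field survives:

```bash
# Sample data, made up for this illustration.
fields="alice:x:1000:Alice Smith"

old_ifs=$IFS
IFS=:
set -f                 # turn off globbing while we rely on splitting
parts=( $fields )      # unquoted on purpose: split on ":" only
set +f
IFS=$old_ifs           # restore the previous IFS

echo "${#parts[@]}"    # prints 4
echo "${parts[3]}"     # prints Alice Smith
```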

On some very rare occasions it may be desirable to leave out the quotes, namely when you need Word Splitting to take place:

    $ friends="Marcus JJ Thomas Michelangelo"
    $ for friend in $friends
    > do echo "$friend is my friend!"; done
    Marcus is my friend!
    JJ is my friend!
    Thomas is my friend!
    Michelangelo is my friend!

But, honestly? You should use arrays for nearly all of these cases. Arrays have the benefit that they separate strings without the need for an explicit delimiter. That means your strings in the array can contain any valid (non-NUL) character, without the worry that it might be your string delimiter (like the space is in our example above). Using arrays in our example above gives us the ability to add middle or last names of our friends:

    $ friends=( "Marcus The Rich" "JJ The Short" "Timid Thomas" "Michelangelo The Mobster" )
    $ for friend in "${friends[@]}"
    > do echo "$friend is my friend!"; done

Note that in our previous for we used an unquoted $friends. This allowed BASH to split our friends string up into words. In this last example, we quoted the ${friends[@]} Parameter Expansion. Quoting an expansion of an array through the @ index makes BASH expand that array into a sequence of its elements, where each element is wrapped in quotes.
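
The difference between the array expansions can be checked by counting arguments. This is a small sketch; count_args is a made-up helper for the illustration.

```bash
friends=( "Marcus The Rich" "JJ The Short" )

# count_args is a hypothetical helper: it prints its argument count.
count_args() { echo "$#"; }

count_args "${friends[@]}"    # prints 2: one argument per element
count_args ${friends[@]}      # prints 6: every element is split into words
count_args "${friends[*]}"    # prints 1: all elements joined into one string
```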

3. Readability

Almost as important as the result of your code is the readability of your code.

Chances are that you aren't just going to write a script once and then forget about it. If so, you might as well delete it after using it. If you plan to continue using it, you should also plan to continue maintaining it. Unlike your house, your code won't get dirty over time, but you will constantly learn new techniques and new approaches. You will also gain insight into how your script is used. All the information you gather after completing your initial code should be used to maintain it in such a way that it constantly improves. Your code should keep growing more user-friendly and more stable.

To make it easier for yourself to keep your code healthy and improve it regularly you should keep an eye on the readability of what you write. When you return to a long loop after a year has passed since your last visit to it and you wish to improve it, add a feature, or debug something about it, believe me when I say you'd rather see this:

    friends=( "Marcus The Rich" "JJ The Short" "Timid Thomas" "Michelangelo The Mobster" )

    # Say something significant about my friends.
    for name in "${friends[@]}"; do

        # My first friend (in the list).
        if [[ $name = "${friends[0]}" ]]; then
            echo "$name was my first friend."

        # My friends whose names start with M.
        elif [[ $name = M* ]]; then
            echo "$name starts with an M"

        # My short friends.
        elif [[ " $name " = *" Short "* ]]; then
            echo "$name is a shorty."

        # Friends I kind of didn't bother to remember.
        else
            echo "I kind of forgot what $name is like."

        fi
    done

Than be confronted with something like this:

    x=(       Marcus\ The\ Rich JJ\ The\ Short
      Timid\ Thomas Michelangelo\ The\ Mobster)
    for name in "${x[@]}"
      do if [ "$name" = "$x" ]; then echo $name was my first friend.
     elif
       echo $name    |   \
      grep -qw Short
        then echo $name is a shorty.
     elif [ "x${name:0:1}" = "xM" ]
         then echo $name starts   with an M; else
    echo I kind of forgot what $name \
     is like.; fi; done

And yes, I know this is an exaggerated example, but I've seen some authentic code that actually has a lot in common with that last example.

For your own health, keep these few points in mind:

4. Bash Tests

The test command, also known as [, is an application that usually resides somewhere in /usr/bin or /bin and is used a lot by shell programmers to perform certain tests on files and variables. In a number of shells, including Bash, test is also implemented as a shell builtin.

It can produce surprising results, especially for people new to shell scripting who think [ ] is part of the shell syntax.

If you use sh, you have little choice but to use test as it is the only way to do most of your testing.

If however you are using Bash to do your scripting (and I presume you are since you're reading this guide), then you can also use the [[ keyword. While it still behaves in many ways like a command, it presents several advantages over the traditional test command.

Let me illustrate how [[ can be used to replace test, and how it can help you to avoid some of the common mistakes made by using test:

    $ var=''
    $ [ $var = '' ] && echo True
    -bash: [: =: unary operator expected
    $ [ "$var" = '' ] && echo True
    True
    $ [[ $var = '' ]] && echo True
    True

[ $var = '' ] expands into [ = '' ]. The first thing test does is count its arguments. Since we're using the [ form, we'll just strip off the mandatory ] argument at the end. In the first example, test sees two arguments: = and ''. It knows that if it has two arguments, the first one has to be a unary operator (an operator that takes one operand). But = is not a unary operator (it's binary -- it requires two operands), so test blows up.

Yes, test did not see our empty $var because BASH expanded it into nothingness before test could even see it. Moral of the story? Use more quotes! Using quotes, [ "$var" = '' ] expands into [ "" = '' ] and test has no problem.

Now, [[ can see the whole command before it's being expanded. It sees $var, and not the expansion of $var. As a result, there is no need for the quotes at all! [[ is safer.
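
The same holds for values that contain whitespace. In this small sketch (the variable names are made up), [ receives the unquoted value as two separate arguments and fails with an error, while [[ compares the value as a whole:

```bash
var="two words"

# [ sees the arguments: two  words  =  "two words"  and reports an error
# (hidden here by sending stderr to /dev/null).
if [ $var = "two words" ] 2>/dev/null; then single=True; else single=False; fi

# [[ does not word-split $var, so the comparison works without quotes.
if [[ $var = "two words" ]]; then double=True; else double=False; fi

echo "[ says $single, [[ says $double"    # prints: [ says False, [[ says True
```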

    $ var=
    $ [ "$var" < a ] && echo True
    -bash: a: No such file or directory
    $ [ "$var" \< a ] && echo True
    True
    $ [[ $var < a ]] && echo True
    True

In this example we attempted a string comparison between an empty variable and 'a'. We're surprised to see the first attempt does not yield True even though we think it should. Instead, we get some weird error that implies BASH is trying to open a file called 'a'.

We've been bitten by File Redirection. Since test is just an application, the < character in our command is interpreted (as it should be) as a File Redirection operator instead of the string comparison operator of test. BASH is instructed to open a file 'a' and connect it to stdin for reading. To prevent this, we need to escape < so that test receives the operator rather than BASH. This makes our second attempt work.

Using [[ we can avoid the mess altogether. [[ sees the < operator before BASH gets to use it for Redirection -- problem fixed. Once again, [[ is safer.

Even more dangerous is using the > operator instead of our previous example with the < operator. Since > triggers output Redirection it will create a file called 'a'. As a result, there will be no error message warning us that we've committed a sin! Instead, our script will just break. Even worse, we might overwrite some important file! It's up to us to guess where the problem is:

    $ var=a
    $ [ "$var" > b ] && echo True || echo False
    True
    $ [[ "$var" > b ]] && echo True || echo False
    False

Two different results, great. Trust me when I say you can always trust [[ more than [. [ "$var" > b ] is expanded into [ "a" ] and the output of that is being redirected into a new file called 'b'. Since [ "a" ] is the same as [ -n "a" ] and that basically tests whether the "a" string is non-empty, the result is a success and the echo True is executed.

Using [[ we get our expected scenario where "a" is tested against "b" and since we all know "a" sorts before "b" this triggers the echo False statement. And this is how you can break your script without realizing it. You will however have a suspiciously empty file called 'b' in your current directory.

So believe me when I say [[ is safer than [. Everybody inevitably makes programming errors. People usually don't intend to introduce bugs in their code; it just happens. So don't tell yourself that you can use [ and simply "be careful not to make these mistakes", because I can assure you that you will make them.

Besides, [[ provides the following features over [:

The only advantage of test is its portability.
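
Two things [[ can do that [ cannot are glob-style pattern matching and regular expression matching. The following is a minimal sketch of both; the variable names are made up, and the regular expression is kept in a variable to sidestep quoting problems inside [[.

```bash
name="Michelangelo The Mobster"

# Pattern matching: an unquoted right-hand side is treated as a glob pattern.
if [[ $name = M* ]]; then
    echo "$name starts with an M"
fi

# Regular expression matching with =~ ; the parts captured by the
# parentheses end up in the BASH_REMATCH array.
re='^([^ ]+) The (.+)$'
if [[ $name =~ $re ]]; then
    first=${BASH_REMATCH[1]}
    title=${BASH_REMATCH[2]}
    echo "$first is known as The $title"
fi
```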

5. Don't Ever Do These

The Bash shell allows you to do quite a lot of things, offering you considerable flexibility. Unfortunately, it does very little to discourage misuse and other ill-advised behavior. It hopes people will find out for themselves that certain things should be avoided at all costs.

Unfortunately many people don't care enough to want to find out for themselves. They write without thinking things through and many awful and dangerous scripts end up in production environments or in Linux distributions. The result of these, and even your very own scripts written in a time of neglect, can often be DISASTROUS.

That said, for the good of your scripts and for the rest of mankind, Never Ever Do Anything Along These Lines:

6. Debugging

Very often you will find yourself clueless as to why your script isn't acting the way you want it to. Resolving this problem is always just a matter of common sense and debugging techniques.


Diagnose The Problem

Unless you know what exactly the problem is, you most likely won't come up with a solution anytime soon. So make sure you understand what exactly goes wrong. Evaluate the symptoms and/or error messages.

Try to formulate the problem as a sentence. This will also be vital if you're going to ask other people for help with your problem. You don't want them to have to go through your whole script or run it just to understand what's going on. No; you need to make the problem perfectly clear to yourself and to anybody trying to help you. This requirement stands until the day the human race invents a means of telepathy.


Minimize The Codebase

If staring at your code doesn't give you a divine inspiration, the next thing you should do is try to minimize your codebase to isolate the problem.

Don't worry about preserving the functionality of your script. The only thing you want to preserve is the logic of the code block that seems buggy.

Often, the best way to do this is to copy your script to a new file and start deleting everything that seems irrelevant from it. Alternatively, you can make a new script that does something similar in the same code fashion, and keep adding structure until you duplicate the problem.

As soon as you delete something that makes the problem go away (or add something that makes it appear), you'll have found where the problem lies. Even if you haven't precisely pinpointed the issue, at least you're not staring at a massive script anymore, but hopefully at a stub of no more than 3-7 lines.

For example, if you have a script that lets you browse images in your image folder by date, and for some reason you can't manage to iterate over your images in the folder properly, it suffices to reduce the script to this part:

    for image in $(ls -R "$imgFolder"); do
        echo "$image"
    done

Your actual script will be far more complex, and the inside of the for loop will also be far longer. But the essence of the problem is this code. Once you've reduced your problem to this it may be easier to see the problem you're facing. Your echo spits out parts of image names; it looks like all whitespace is replaced by newlines. That must be because echo is run once for each chunk terminated by whitespace, not once for every image name (as a result, image names with whitespace in them appear to have been split open). With this reduced code, it's easier to see that the cause is actually your for statement, which splits the output of ls into words. That's because the output of ls cannot be parsed in a bug-free manner (never use ls in scripts, unless you only want to show its output to a user).

We can't use a recursive glob (unless we're in bash 4), so we have to use find to retrieve the filenames. One fix would be:

    find "$imgFolder" -print0 | while IFS= read -r -d '' image; do
        echo "$image"
    done

Now that you've fixed the problem in this tiny example, it's easy to merge it back into the original script.
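
For Bash 4 and later, the recursive glob mentioned above could look roughly like the following sketch. The imgFolder variable comes from the example; the temporary folder, the file names and the images array are made up so that the sketch can run on its own.

```bash
# Requires Bash 4 or later for the globstar option.
shopt -s globstar nullglob

# Build a throwaway folder so the sketch is self-contained.
imgFolder=$(mktemp -d)
mkdir -p "$imgFolder/2007 summer"
touch "$imgFolder/2007 summer/beach day.jpg" "$imgFolder/cat.jpg"

images=()
for image in "$imgFolder"/**/*.jpg; do
    images+=( "$image" )      # each filename stays in one piece
done

echo "${#images[@]}"          # prints 2
rm -r "$imgFolder"
```

Because the glob expands to one word per file and the loop body quotes "$image", whitespace in the names is never an issue. Unlike the find pipeline, the loop also runs in the current shell, so the array is still available after it.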


Activate BASH's Debug Mode

If you still don't see the error of your ways, BASH's debugging mode might help you see the problem through the code.

When BASH runs with the x option turned on, it prints out every command it executes before executing it (to standard error). That is, after any expansions have been applied. As a result, you can see exactly what's happening as each line in your code is executed. Pay very close attention to the quoting used. BASH uses quotes to show you exactly which strings are passed as a single argument.

There are three ways of turning on this mode.
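
Commonly used ways include running the script with bash -x, putting -x on the script's #! line, and calling set -x inside the script. The following is a minimal sketch of the last one, which can be switched off again with set +x. The variable names are made up, and the trace is captured into a variable only so that it can be inspected afterwards.

```bash
# set -x writes its trace to stderr, so stderr is redirected into the
# command substitution to capture it.
trace=$(
    {
        set -x
        sentence="Push that   word"
        echo $sentence > /dev/null      # unquoted on purpose
        echo "$sentence" > /dev/null
        set +x
    } 2>&1
)
echo "$trace"
```

In the captured trace, the unquoted echo shows up with three separate words, while the quoted one shows up as a single quoted string.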

Because the debugging output goes to stderr, you will generally see it on the screen, if you are running the script in a terminal. If you would like to log it to a file, you can tell Bash to send all stderr to a file:

    exec 2>> /path/to/my.logfile
    set -x

A nice feature of bash version >= 4.1 is the variable BASH_XTRACEFD. This allows you to specify the file descriptor to write the set -x debugging output to. In older versions of bash, this output always goes to stderr, and it is difficult if not impossible to keep it separate from normal output (especially if you are logging stderr to a file, but you need to see it on the screen to operate the program). Here's a nice way to use it:

    # dump set -x data to a file
    # turns on with a filename as $1
    # turns off with no params
    # note that FD 4 should not be used elsewhere in the script
    setx_output()
    {
        if [[ $1 ]]; then 
           exec 4>>"$1"
           BASH_XTRACEFD=4
           set -x
        else
           set +x
           exec 4>&-
        fi
    }

If you have a complicated mess of scripts, you might find it helpful to change PS4 before setting -x:

    export PS4='+$BASH_SOURCE:$LINENO:$FUNCNAME: '


Step your code

If the script goes too fast for you, you can enable code-stepping. The following code uses the DEBUG trap to inform the user about what command is about to be executed and to wait for their confirmation to do so. Put this code in your script, at the location where you wish to begin stepping:

    trap '(read -p "[$BASH_SOURCE:$LINENO] $BASH_COMMAND?")' DEBUG


The BASH debugger

The Bash Debugger Project is a gdb-style debugger for bash, available from http://bashdb.sourceforge.net/

The Bash Debugger will allow you to walk through your code and help you track down bugs.


Reread The Manual

If your script still doesn't seem to agree with you, maybe your perception of the way things work is wrong. Try going back to the manual (or this guide) to re-evaluate whether commands do exactly what you think they do, or the syntax is what you think it is. Very often people misunderstand what for does, how Word Splitting works, or how often they should use quotes.

Keep the tips and good practice guidelines in this guide in mind as well. They often help you avoid bugs and problems with scripts.

I mentioned this in the Scripts section of this guide too, but it's worth repeating here. First of all, make sure your script's header is actually #! /bin/bash. If it is missing, or if it's something like #! /bin/sh, then you deserve the problems you're having. That means you're probably not even using BASH to run your code. Obviously that'll cause issues. Also, make sure you have no Carriage Return characters at the ends of your lines. These are typically left behind when a script is written in Microsoft Windows(tm). You can get rid of them fairly easily, for example by filtering the file through tr -d '\r'.
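
One common way to strip Carriage Returns, sketched here with made-up file names, is to filter the script through tr and then replace the original with the result:

```bash
# Create a sample script with Windows-style line endings (made-up file).
script=$(mktemp)
printf '#! /bin/bash\r\necho hello\r\n' > "$script"

# Delete every carriage return, then replace the original file.
tr -d '\r' < "$script" > "$script.tmp" && mv "$script.tmp" "$script"

output=$(bash "$script")
echo "$output"            # prints hello
rm -f "$script"
```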


Read the FAQ / Pitfalls

The BashFAQ and BashPitfalls pages explain common misconceptions and issues encountered by other BASH scripters. It's very likely that your problem will be described there in some shape or form.

To be able to find your problem in there, you'll obviously need to have Diagnosed it properly. You'll need to know what you're looking for.


Ask Us On IRC

There are people in the #bash channel almost 24/7. This channel resides on the freenode IRC network. To reach us, you need an IRC client. Connect it to irc.freenode.net, and /join #bash.

Make sure that you know what your real problem is and have stepped through it on paper, so you can explain it well. We don't like having to guess at things. Start by explaining what you're trying to do with your script.

Either way, please have a look at this page before entering #bash: XyProblem.




2012-07-01 04:05