Some shell script tricks

Even though I like graphical interfaces, the shell is probably the tool I use most frequently for interacting with a computer.

On Windows it’s PowerShell, or bash if Cygwin is available. On GNU/Linux machines it’s bash too, or, if I have enough permissions to install it and set it as the default interactive shell, zsh.

The biggest advantage of the shell (especially on GNU/Linux) is the ability to automate and compose tasks.

I’m not a great bash scripter; most of the time I have to look up the syntax for writing a loop or other constructs correctly, but I’ve collected some useful tricks and hints for writing more robust shell scripts.

exit on error by default

It’s easy to forget to check for errors. Consider

cd $1
# do other things

What if $1 is not a directory?

In its current form, the cd command might fail, and the lines after that statement will still get executed. This can have nasty side effects, like deleting or creating files inside the wrong directory.

Instead of checking every return code of every command, which would increase the complexity and reduce the readability of the script, it is possible to stop and exit immediately on the first failure:

set -o errexit

# short form:
# set -e

This option is available in POSIX sh, which means it is available in every shell that claims to be compatible: bash, dash, zsh, …

Of course there are situations where we can and want to handle errors; the documentation states it clearly:

The -e setting shall be ignored when executing the compound list following the while, until, if, or elif reserved word, a pipeline beginning with the ! reserved word, or any command of an AND-OR list other than the last.

Knowing that on every error the script terminates

  • forces us to think clearly about which commands can fail and which failures we want or can handle

  • makes it easier to find bugs while developing the script

If you are using bash instead of POSIX sh, you might also want to enable the following option: set -o pipefail. Unfortunately it is not available in all shells, and there is no portable equivalent.
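For example, a minimal sketch of the cd case from above (the rm call is only an illustrative follow-up command):

#!/bin/sh
set -o errexit
# in bash, one could additionally use: set -o pipefail

cd "$1"        # if cd fails, the script stops here
rm -f ./*.tmp  # never executed in the wrong directory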

static analysis

shellcheck is a must-have when writing shell scripts.

It does not support zsh, but you should probably prefer writing scripts in bash or in sh anyway, given that those are available on nearly all systems, and given that zsh does not really offer great benefits over bash for non-interactive use (i.e. scripts).

It is a static analyzer, thus it does not need to execute the script to find common pitfalls.

Consider again

cd $1
# do other things

What if it is a directory but its name contains spaces? Then the cd command will be executed with multiple parameters and probably fail.

What if the directory name contains *? The shell might expand it to matching filenames, thus cd might change to the wrong directory.

What if the directory is named -? While it is difficult to foresee the third problem, the first two have easy fixes, but let’s not digress.

For example, on the previous sample shellcheck would have mentioned that we should write cd "$1" instead of cd $1 to avoid issues if $1 holds a valid path with spaces, and to avoid globbing.
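Running it is straightforward (my_script.sh is just a placeholder name), and nothing gets executed:

shellcheck my_script.sh

# it can also read a script from standard input
printf 'cd $1\n' | shellcheck --shell=sh -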

distinguish between unset and set but empty parameters

When dealing with optional parameters, distinguishing between unset and set-but-empty variables is often confusing.

Adding

set -o nounset

# short form:
# set -u

will help to spot some edge cases, since it will trigger an error, and together with set -o errexit the script will stop immediately.
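For example, a minimal sketch (not_defined is deliberately never assigned):

#!/bin/sh
set -o errexit
set -o nounset

printf '%s\n' "$not_defined"   # the shell aborts here with an "unset parameter" error
printf '%s\n' "never reached"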

Parameter expansion

In my experience, a not-so-well-known (or used) feature of the shell is parameter expansion.

Use it for default values

The following expressions are very useful when dealing with optional and default values:

unset var
printf '%s\n' "${var:-default value}"
# prints: "default value"

var= # or var=''
printf '%s\n' "${var:-default value}"
# prints: "default value"

var=value
printf '%s\n' "${var:-default value}"
# prints: "value"


unset var
printf '%s\n' "${var-default value}"
# prints: "default value"

var= # or var=''
printf '%s\n' "${var-default value}"
# prints: ""

var=value
printf '%s\n' "${var-default value}"
# prints: "value"

In other words, leaving out the colon changes the test from "unset or empty" to just a test for "unset". The same distinction applies to the :=, :?, and :+ forms of parameter expansion as well.
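The other forms behave analogously; for example:

unset var
: "${var:=default}" # assigns "default" to var, since it is unset
printf '%s\n' "$var"
# prints: "default"

printf '%s\n' "${var:+alternative}"
# prints: "alternative" (var is set and not empty)

unset var
printf '%s\n' "${var:?var must be set}"
# the shell aborts and prints an error containing "var must be set"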

For example, define the default number of parallel jobs (supposing $1 is the relevant parameter of the shell script):

DEFAULT_NUM_SLAVES="$(($(grep -c processor /proc/cpuinfo)-1))"
NUM_SLAVES="${1-$DEFAULT_NUM_SLAVES}"

Notice that it is not necessary to quote the default value (the same holds for a variable) inside the braces; it can be done, but most times it hurts readability.

Use it for testing if a variable is defined/set

Sometimes, depending on whether a variable is set or not, we want to do something, instead of using a default value:

if [ -n "${TMUX+x}" ]; then
  # inside tmux session, do something meaningful
fi

If $TMUX is unset, then ${TMUX+x} evaluates to nothing, otherwise it substitutes the string x (which can be any string).

Use it for extracting substrings

A common use case for substrings is extracting the file name and/or the extension from a string that represents a path:

my_path=... # might exist or not
filename="${my_path##*/}" # as alternative to: filename="$(basename -- "$my_path")"
last_extension="${my_path##*.}"
filename_without_last_extension="${filename%.*}"
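For example, with a concrete path:

my_path=/tmp/archive.tar.gz
printf '%s\n' "${my_path##*/}"
# prints: "archive.tar.gz"
printf '%s\n' "${my_path##*.}"
# prints: "gz"
filename="${my_path##*/}"
printf '%s\n' "${filename%.*}"
# prints: "archive.tar"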

Check if string contains another given string

if [ "${string#*"$substring"}" != "$string" ]; then
   # $substring is in $string
else
   # $substring is not in $string
fi
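For example:

string="hello world"
substring="lo wo"
if [ "${string#*"$substring"}" != "$string" ]; then
  printf '%s\n' "found"
fi
# prints: "found"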

Handling arguments at fixed positions

Using parameter expansion is the easiest way to provide default values:

PARAM1="${1-default value}"

For something more complex, test if the value is set:

if [ -z "${1+x}" ]; then
  # $1 not set
  # set to default value, trigger an error, print a message/warning, ...
fi

Prefer long arguments to short when writing script

Sure, it’s nice to write ls -a instead of ls --all, but for more exotic commands (do you remember all tar options?), writing down the long form improves readability. And since debugging shell scripts is not that pleasant, they should be as easy as possible to read.
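For example, assuming GNU tar, the following commands are equivalent, but the second one is self-documenting:

tar -xzf archive.tar.gz
tar --extract --gzip --file archive.tar.gz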

I normally do not know all the options for a given command. Searching in the man page, or online with a search engine, for -a will normally give less meaningful results than searching for --all, especially for more esoteric commands.

For commands written directly in the console and executed immediately, it does not make much sense to prefer the long form to the short one if both are known at the given time. But scripts might be read by other people, or by ourselves in a week, so they should be much more self-explanatory.

Avoid confusing filenames

As in the previous example, if you have a directory named -, you cannot simply cd to it, since cd - will bring you to your previous directory. If you do not have an absolute path, prepending ./ to any unknown relative path, or converting any unknown relative path to an absolute one, will fix the issue.

cd is not the only command or tool that has such problems; many command line programs accept both options and directories as arguments, and if a directory name coincides with an option, it gets tricky to distinguish between them.

For example mkdir ./-la creates a directory named -la (while mkdir -la is interpreted as passing options and fails), and ls -la and ls ./-la will do different things.

Therefore I strive to create "portable" names when creating files or directories. The POSIX specification defines a "Portable Filename Character Set" (not to be confused with the "Portable Character Set"). Long story short: only ASCII letters, numbers, period, underscore and minus (hyphen) are allowed in a portable filename.

Notice that with this definition it is still possible to create a filename that begins with a minus, and that could thus get confused with a command line option. It is therefore better, for simplicity, to create portable filenames that do not begin with a hyphen.

On the other hand, spaces are not part of the "Portable Filename Character Set". Since the space is such a common character, all tools should nevertheless be able to handle a file or path containing one.

Nevertheless we still have to deal with existing files, and those might have strange names. Prepending ./ (or converting the relative path to an absolute one) fixes the problem in most cases, since most of the time only names beginning with - are a problem. I’ve never seen, for example, a program using something beginning with ./ as an argument, and thus with possible confusion between filenames and parameters.

Notice that tools like realpath (for converting relative paths to absolute ones) take arguments too. On my machine, the following command fails: realpath -e. On the other hand, realpath ./-e gets converted correctly.

Some tools give the possibility to separate options and paths with --. If such an option is available, use it, especially in a script, since it avoids a lot of hidden bugs. For example realpath -- -e works correctly.
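For example, to remove a file that is literally named -rf, either form prevents it from being parsed as options:

rm -- -rf   # "--" marks the end of the options
rm ./-rf    # prepending ./ works also with tools that do not support "--"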

Otherwise, try not to rely on anything, not even on the encoding of filenames; this is also the reason why the output of ls should not be parsed.

help and main functions

Some scripts are merely one-liners, whereas others are so complex that it makes sense to split them into functions.

Personally I like to have a main function (like in many other languages); this makes it easy to find the entry point and, if needed, to split the script into more modules. When executing the script, this line forwards all arguments unchanged to the main function: main "$@"

Since the script is complex (otherwise there would be no need to add a main function), main should call other functions that contain the business logic.

Another common function is help. It prints a message that explains what the script does and what the possible options are. This function is normally executed when the script is called with -h, --help or used incorrectly.
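A minimal sketch of such a layout (the usage text is only illustrative):

#!/bin/sh
set -o errexit
set -o nounset

help() {
  printf '%s\n' "usage: $0 [--help] <directory>"
}

main() {
  case "${1-}" in
    -h|--help|'')
      help
      exit 0
      ;;
  esac
  # ... call further functions containing the business logic
}

main "$@"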

Changing IFS and other environment variables or global status

It is an accepted practice to change environment variables, but most of the time there is no need.

Changing the global status might have unpredictable side effects, therefore it is better to try to limit the scope as much as possible.

Instead of writing

a="..."
export a
command_or_function
unset a

which might be problematic if a was already defined, we can write

a="..." command_or_function

which will nearly do the same thing, and without changing the global scope.

Same holds for common global variables:

IFS=value command_or_function
LD_LIBRARY_PATH=value command_or_function
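For example, a common pattern is to scope IFS to a single read command when processing a file line by line ($input_file is just a placeholder):

while IFS= read -r line; do
  printf '%s\n' "$line"
done < "$input_file"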

Otherwise it is possible to try to back up and restore the global status:

old_IFS="${IFS}";
command_or_function;
IFS="${old_IFS}"

It surely works in simple cases, but with more complex logic, resetting the old state is error prone. For example command_or_function might change IFS too, and options like set -u could cause a failure when backing up IFS if it was not set in the first place.

Another piece of global state is the current directory. Changing it, and changing it back, is trickier than necessary; if possible it should be avoided. If the user presses ctrl+c, opens a new pane in tmux, or simply because of an internal bug, they might find themselves in the wrong place.

Many commands that accept filenames also work with full paths, and those that do not might have a separate option for specifying the working directory.

For example, in git, instead of writing

cd "$path";
git ....
# other commands
cd -

it is possible to write git -C "$path" … instead,

or, if -C is not available because your git version is too old, it is always possible to create a subshell

(
  cd "$path";
  git ....
  # other commands
)

Avoiding bashism

Bash offers additional features compared to other POSIX-compliant shells, but most of them do not add as much value as they do for the interactive shell. The common denominator for all shells on UNIX systems should be the POSIX specification. The more a script sticks to it, the easier it is to use with different shells (for example also when copy-pasting commands) and with different versions of the same shell. It might also make your scripts more efficient.

On Windows it’s a different story, since the syntax of PowerShell, cmd and the POSIX specification are completely different. It’s unfortunate, but there is nothing we can do about it.

If you are unsure whether a feature works on a POSIX-compliant shell, and are working on a Debian-based system (or Cygwin), you could always try dash (it might be installed by default on other systems too). You can use dash -n to check the syntax of a script under dash without actually running it. It is not a perfect test, but it should be good enough for many purposes.

Of course, if you are already using bash and need some feature it adds (normally it should not be the case), just use it, unless you already know your script needs to be compatible with other shells too.

Not using some feature of bash might mean using an external tool, which might not always be available!

For example the parameter expansion "${parameter/pattern/string}" is a bashism. To replace it, it is possible to use external tools like sed, grep, and/or awk. Those are external dependencies, granted, but they should be available on all systems. They were probably not necessary before, so it is not obvious which dependency, the bash shell or those tools, is less of a burden.
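For example, a possible POSIX replacement for that expansion using sed (the variable names are only illustrative):

var='foo bar foo'

# bash only:
# new="${var/foo/baz}"

# POSIX, relying on the external sed:
new="$(printf '%s\n' "$var" | sed 's/foo/baz/')"
printf '%s\n' "$new"
# prints: "baz bar foo"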

Verify if a program is available

Normally my shell scripts just invoke external tools, possibly manipulate the outputs and call other tools. There is the hidden assumption that many "common" (the mileage of "common" might vary) tools, like ls, find, … are available.

Sometimes a script needs to execute some tool that is normally not available, and the absence of such a tool is not necessarily an error. It might be an optional action, or an action that can be performed with another tool, thus we want to know if at least one of those is available.

For example, how can we check if objdump is available without executing it?

command should cover most use cases:

tool=...
if command -v "$tool" >/dev/null 2>&1; then
  # tool is available
fi
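For example, a sketch of picking the first available of several equivalent tools (curl and wget are only illustrative here):

if command -v curl >/dev/null 2>&1; then
  downloader=curl
elif command -v wget >/dev/null 2>&1; then
  downloader=wget
else
  printf '%s\n' "error: neither curl nor wget is available" >&2
  exit 1
fi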

The nice thing about command is that it works with aliases and functions too, so it is handy to use in .bashrc as well.

Of course it does not solve the problem if there exist different versions of the program that accept different flags, maybe also depending on the platform or vendor of the tool. In that case the only option might be to execute the tool and see if the flag is accepted. Consider installing the most recent and portable version, it might avoid a lot of headaches.

Avoid complex scripts

When the script gets too complex, change scripting language.

sh is a very nice glue language; like all the others it has a lot of issues, and most of all, it is not very structured. It’s superb for trying something out, calling and putting together a couple of tools, but writing complex logic and error handling is more difficult compared to most other languages.

Because nearly all distributions have Python (hopefully python3 in the meantime) installed, this is most of the time my first choice when "upgrading" a script. While I do not like Python that much, and of course it has its issues too, it makes it possible to write more structured code, and it still feels like a shell language. It is more verbose when calling external tools, but probably some of those tools can be replaced with library functions, which also makes your code simpler, more portable and probably easier to maintain.

In Unix-like systems, executing a python script or a shell script is the same action. Through the shebang the system knows how to interpret the file.
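For example, this trivial script is interpreted by /bin/sh; replacing the first line with #!/usr/bin/env python3 (and the body with Python code) would not change how it is invoked:

#!/bin/sh
printf '%s\n' "hello"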

Thus we can upgrade the script from one language to another without even changing the file name. Unfortunately this is not possible on Windows, but we can always replace the old script with a new one that calls, for example, the python script. This way it should be possible to avoid some transition issues if those scripts are deployed on multiple machines.