Lately I've been working on a lot of automation and monitoring projects, a big part of these projects are taking existing scripts and modifying them to be useful for automation and monitoring tools. One thing I have noticed is sometimes scripts use exit codes and sometimes they don't. It seems like exit codes are easy for poeple to forget, but they are an incredibly important part of any script. Especially if that script is used for the command line.
What are exit codes?
On Unix and Linux systems, programs can pass a value to their parent process while terminating. This value is referred to as an exit code or exit status. On POSIX systems the standard convention is for the program to pass
0 for successful executions and
1 or higher for failed executions.
Why is this important? If you look at exit codes in the context of scripts written to be used for the command line the answer is very simple. Any script that is useful in some fashion will inevitably be either used in another script, or wrapped with a bash one liner. This becomes especially true if the script is used with automation tools like SaltStack or monitoring tools like Nagios, these programs will execute scripts and check the status code to determine whether that script was successful or not.
On top of those reasons, exit codes exist within your scripts even if you don't define them. By not defining proper exit codes you could be falsely reporting successful executions which can cause issues depending on what the script does.
What happens if I don't specify an exit code
In Linux any script run from the command line has an exit code. With Bash scripts, if the exit code is not specified in the script itself the exit code used will be the exit code of the last command run. To help explain exit codes a little better we are going to use a quick sample script.
#!/bin/bash touch /root/test echo created file
The above sample script will execute both the
touch command and the
echo command. When we execute this script (as a non-root user) the touch command will fail, ideally since the touch command failed we would want the exit code of the script to indicate failure with an appropriate exit code. To check the exit code we can simply print the
$? special variable in bash. This variable will print the exit code of the last run command.
$ ./tmp.sh touch: cannot touch '/root/test': Permission denied created file $ echo $? 0
As you can see after running the
./tmp.sh command the exit code was
0 which indicates success, even though the touch command failed. The sample script runs two commands
echo, since we did not specify an exit code the script exits with the exit code of the last run command. In this case, the last run command is the
echo command, which did execute successfully.
#!/bin/bash touch /root/test
If we remove the
echo command from the script we should see the exit code of the
$ ./tmp.sh touch: cannot touch '/root/test': Permission denied $ echo $? 1
As you can see, since the last command run was
touch the exit code reflects the true status of the script; failed.
Using exit codes in your bash scripts
While removing the
echo command from our sample script worked to provide an exit code, what happens when we want to perform one action if the
touch was successful and another if it was not. Actions such as printing to
stdout on success and
stderr on failure.
Testing for exit codes
Earlier we used the
$? special variable to print the exit code of the script. We can also use this variable within our script to test if the
touch command was successful or not.
#!/bin/bash touch /root/test 2> /dev/null if [ $? -eq 0 ] then echo "Successfully created file" else echo "Could not create file" >&2 fi
In the above revision of our sample script; if the exit code for
0 the script will
echo a successful message. If the exit code is anything other than
0 this indicates failure and the script will
echo a failure message to
$ ./tmp.sh Could not create file
Providing your own exit code
While the above revision will provide an error message if the
touch command fails, it still provides a
0 exit code indicating success.
$ ./tmp.sh Could not create file $ echo $? 0
Since the script failed, it would not be a good idea to pass a successful exit code to any other program executing this script. To add our own exit code to this script, we can simply use the
#!/bin/bash touch /root/test 2> /dev/null if [ $? -eq 0 ] then echo "Successfully created file" exit 0 else echo "Could not create file" >&2 exit 1 fi
exit command in this script, we will exit with a successful message and
0 exit code if the
touch command is successful. If the
touch command fails however, we will print a failure message to
stderr and exit with a
1 value which indicates failure.
$ ./tmp.sh Could not create file $ echo $? 1
Using exit codes on the command line
Now that our script is able to tell both users and programs whether it finished successfully or unsuccessfully we can use this script with other administration tools or simply use it with bash one liners.
Bash One Liner:
$ ./tmp.sh && echo "bam" || (sudo ./tmp.sh && echo "bam" || echo "fail") Could not create file Successfully created file bam
The above grouping of commands use what is called list constructs in bash. List constructs allow you to chain commands together with simple
&& for and and
|| for or conditions. The above command will execute the
./tmp.sh script, and if the exit code is
0 the command
echo "bam" will be executed. If the exit code of
1 however, the commands within the parenthesis will be executed next. Within the parenthesis the commands are chained together using the
|| constructs again.
The list constructs use exit codes to understand whether a command has successfully executed or not. If scripts do not properly use exit codes, any user of those scripts who use more advanced commands such as list constructs will get unexpected results on failures.
More exit codes
exit command in bash accepts integers from
0 - 255, in most cases
1 will suffice however there are other reserved exit codes that can be used for more specific errors. The Linux Documentation Project has a pretty good table of reserved exit codes and what they are used for.
Recently Benjamin published his first book; Red Hat Enterprise Linux Troubleshooting Guide. In addition to writing, he has several Open Source projects focused on making Ops easier. These projects include Automatron, a project enabling auto-healing infrastructure for the masses.
Identify, capture and resolve common issues faced by Red Hat Enterprise Linux administrators using best practices and advanced troubleshooting techniques
What people are saying:
Excellent, excellent resource for practical guidance on how to troubleshoot a wide variety of problems on Red Hat Linux. I particularly enjoyed how the author made sure to provide solid background and practical examples. I have a lot of experience on Red Hat but still came away with some great practical tools to add to my toolkit. - Amazon Review