The Advent of CLI - Day 4

Let’s continue with Programming

So here we are, with this little piece of code running as a shell script
in our $HOME/bin directory

aa-project-create

#!/bin/bash
# create new project

current_dir() {
    echo "$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
}

# --- main ---

CURDIR=$(current_dir)
TARGET="$1"

if [[ -z "${TARGET}" ]]; then
    echo "project target is missing"
    echo "eg. $ ./${0##*/} projectname"
    exit 1
fi

if [ -d "$TARGET" ]; then
    echo "project directory already exists"
    exit 1
fi

# create project directory
mkdir "${TARGET}"

# create sub directories
mkdir "${TARGET}/build"
mkdir "${TARGET}/docs"
mkdir "${TARGET}/src"

# create files
touch "${TARGET}/README.md"

# done
exit 0

The keen eye will notice that I included a couple of “mistakes”:
not really bugs, since the program runs, but things you could see
as either useless or a bit strange.

this part is useless

current_dir() {
    echo "$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
}

and this part too

CURDIR=$(current_dir)

Simply put, nowhere in our script do we use the $CURDIR variable.

But I did all that on purpose, of course.

Personally, I use this little function current_dir in almost all of my scripts.
Why? Because it is pretty damn useful to find out the directory where your
script is stored, so you can call other scripts stored in the same directory;
you usually tend to group those scripts in the same place.

And yes, we are going to analyse it :).

From the inside out:

  • first ${BASH_SOURCE[0]}
    gives you the path of the currently running script, as it was invoked
    (so it can be a relative path)
  • then dirname "${something}"
    gives you only the directory part of that path
  • then cd "${something}"
    navigates into that directory
    and returns 0 (exit code) if successful,
    otherwise returns non-zero
  • and finally, only if the previous command (cd) returns 0,
    the following command pwd is executed and prints the absolute path
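
To make the "inside out" reading easier to follow, here is a rough sketch of the same function split into one step per line. The function name current_dir_steps and the intermediate variable names are made up for this example, the logic is meant to be the same as current_dir.

```bash
#!/bin/bash
# Step-by-step version of current_dir.
# current_dir_steps, script_path, script_dir and absolute_dir are
# hypothetical names used only for this illustration.

current_dir_steps() {
    local script_path script_dir absolute_dir

    # 1. path of the running script, as it was invoked (may be relative)
    script_path="${BASH_SOURCE[0]}"

    # 2. keep only the directory part of that path
    script_dir=$(dirname "$script_path")

    # 3. + 4. enter that directory and, only if cd succeeded, print it;
    #         $( ) runs in a subshell, so our own working directory is untouched
    absolute_dir=$(cd "$script_dir" && pwd)

    echo "$absolute_dir"
}

CURDIR=$(current_dir_steps)
echo "this script lives in: ${CURDIR}"
```

Because the cd happens inside $( ), the script itself never changes directory, it only captures what pwd printed.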

Many things to learn here

  • there are special bash variables available
    within your script, like $BASH_SOURCE
  • you can get the output of a command as a string
    eg. $(command param1 param2 etc.)
    as a side note you could also use backticks
    eg. `command param1 param2 etc.`
  • you can chain command execution with &&
    $ command1 && command2 && command3
    but the next command only executes if the previous one returns 0
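
A tiny sketch to try command substitution and && chaining; the variable names (modern, legacy, chain_status) are only picked for this example.

```bash
#!/bin/bash
# command substitution: both forms capture what the command prints
modern=$(echo "captured with dollar-parens")
legacy=`echo "captured with backticks"`
echo "$modern"
echo "$legacy"

# chaining with &&: the right side runs only if the left side returned 0
true && echo "printed, because true returned 0"

# false returns 1, so the echo after it is skipped
false && echo "never printed, because false returned 1"
chain_status=$?
echo "exit code of the failed chain: ${chain_status}"
```

The $( ) form is usually easier to read and to nest than backticks, which is probably why you see it more often in recent scripts.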

Learn about the different Shell Variables (from bash hackers)

Learn about Environment Variables too; it may not be obvious, but they are available to your script as well.

Read the article How To Read and Set Environmental and Shell Variables on a Linux VPS, which does a pretty good job of explaining their differences

Environmental variables are variables that are defined for the current shell
and are inherited by any child shells or processes. Environmental variables are
used to pass information into processes that are spawned from the shell.

and

Shell variables are variables that are contained exclusively within the shell
in which they were set or defined. They are often used to keep track of ephemeral data,
like the current working directory.

Now if you want to list all the Environment Variables in your shell use
$ printenv

If you need to permanently define an env var (short for “Environment Variable”) use
$ export FOOBAR="hello world"

Note:
“permanently” means for the current shell, its subshells and its child processes.
If you open another terminal window, this env var will not be available.

If you need to temporarily define an env var while running a particular command use
$ env FOOBAR="hello world" command

NOTE:
do not close the env statement with a semicolon ;
eg. $ env FOOBAR="hello world"; command
nor use the AND && operator
eg. $ env FOOBAR="hello world" && command
the command HAS TO follow right after the env declaration,
to know more see $ man env (ie. “run a program in a modified environment”).
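
If you want to see the difference between the temporary env form and export, here is a small sketch. It uses a child bash process to check what gets inherited; FOOBAR is the example name from above, the other variable names are made up.

```bash
#!/bin/bash
# start from a clean state, in case FOOBAR already exists
unset FOOBAR

# temporary: FOOBAR exists only for the command started by env
temporary=$(env FOOBAR="hello world" bash -c 'echo "$FOOBAR"')
echo "child started with env sees: ${temporary}"

# our own shell was not modified by the env call
after_env="${FOOBAR:-<not set>}"
echo "our shell afterwards sees: ${after_env}"

# exported: every child process started from now on inherits FOOBAR
export FOOBAR="hello world"
inherited=$(bash -c 'echo "$FOOBAR"')
echo "child started after export sees: ${inherited}"
```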

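If you need to remove or delete an env var use
$ unset FOOBAR

If you only want to stop exporting it to child processes, but keep it as a shell variable, use
$ export -n FOOBAR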
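
To see what export -n and unset each do to a variable, here is a small sketch (FOOBAR is the same example name as above, the other variable names are made up for the demonstration).

```bash
#!/bin/bash
export FOOBAR="hello world"

# export -n removes the "exported" attribute:
# child processes no longer see FOOBAR, but this shell still has it
export -n FOOBAR
child_after_export_n=$(bash -c 'echo "${FOOBAR:-<not set>}"')
shell_after_export_n="${FOOBAR:-<not set>}"
echo "after export -n, child sees: ${child_after_export_n}"
echo "after export -n, shell sees: ${shell_after_export_n}"

# unset removes the variable altogether
unset FOOBAR
shell_after_unset="${FOOBAR:-<not set>}"
echo "after unset, shell sees: ${shell_after_unset}"
```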

So that’s about it for variables

  • positional parameters like $1, $2, $3, etc.
    allow you to pass arguments to your shell scripts
  • you can use shell variables like $HOME, $SHELL, etc.
    those are predefined inside the current Bash shell
    let’s see them as “ephemeral data”
  • you can use environment variables like $TERM, $USER, etc.
    those are available throughout your “session” and inherited by child shells (eg. subshells)
    and child processes, but not necessarily available in another “session”
  • if you do want to define environment variables permanently
    to be used in different “sessions” you will have to write them
    into one of the many config files
    • $ echo 'export FOOBAR="hello world"' >> ~/.profile
      for all your own user sessions loading the config file .profile
    • $ echo 'export FOOBAR="hello world"' >> /etc/profile (as root)
      for every single user session on the system
      I would say think twice about doing something like that
  • and finally your own user defined variables
    FOOBAR="hello world" in your shell script
    • with the option to make it available to child shells/processes
      with an export FOOBAR
    • and also remember the local variables defined inside functions
      for ex local message="$1"
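
As a quick illustration of the last point, here is a sketch with a hypothetical function greet: inside a function $1 is the first argument of the function (not of the script), and a local variable disappears once the function returns.

```bash
#!/bin/bash
# greet is a made-up function, used only for this example
unset message

greet() {
    # $1 is the first argument passed to the function;
    # "local" keeps message visible only inside greet
    local message="$1"
    echo "hello ${message}"
}

greeting=$(greet "world")
echo "$greeting"

# call it directly (no subshell) and check that message did not leak out
greet "again"
outside="${message:-<not set>}"
echo "outside the function, message is: ${outside}"
```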

And now we can fully explain what the following lines are doing

CURDIR=$(current_dir)
TARGET="$1"

Define the variable CURDIR and assign it the output of the function current_dir.

Define the variable TARGET and assign it the 1st argument (ie. $1) passed to the shell script.

We then consider $CURDIR useless because we don’t reuse it anywhere (for now),
and by association that also makes the function current_dir useless.


Now let’s talk about the other lines, and so about conditional statements.

This part

if [[ -z "${TARGET}" ]]; then
    echo "project target is missing"
    echo "eg. $ ./${0##*/} projectname"
    exit 1
fi

if [ -d "$TARGET" ]; then
    echo "project directory already exists"
    exit 1
fi

this part alone is not strange

if [[ -z "${TARGET}" ]]; then
    echo "project target is missing"
    echo "eg. $ ./${0##*/} projectname"
    exit 1
fi

this part alone is also not strange

if [ -d "$TARGET" ]; then
    echo "project directory already exists"
    exit 1
fi

the two parts above used together are a bit strange
because they are slightly different

one uses
if [ condition ]

and the other uses
if [[ condition ]]

but the result/behaviour is exactly the same …

First, let’s study if statements a bit, the syntax usually looks like this

if commands1
then
   commands2
else
   commands3
fi

you can move the then back to the previous line if you use ; to separate the statements

if commands1; then
   commands2
else
   commands3
fi

and yeah you always close an if statement with a fi statement

Let’s also talk about the “else if” part

if commands1
then
   commands2
elif commands3
then
   commands4
else
   commands5
fi

and

if commands1; then
   commands2
elif commands3; then
   commands4
else
   commands5
fi

That’s why I try to move the then to the previous line, imho it is more readable.
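
Here is a small runnable sketch of the if / elif / else shape, using the same kind of [ -d ... ] test as our script. The function name kind_of is made up for the example.

```bash
#!/bin/bash
# kind_of is a hypothetical helper: it tells what a path points to
kind_of() {
    local path="$1"

    if [ -d "$path" ]; then
        echo "directory"
    elif [ -f "$path" ]; then
        echo "file"
    else
        echo "missing"
    fi
}

kind_of "/"
kind_of "/this/path/should/not/exist"
```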

And wait a minute … I started talking about “conditional statements” but I am showing you syntax with commands?

Oh that’s because each condition is basically a command that is supposed to return an exit code of 0 meaning true or success,
or any non-zero exit code meaning false or failure.
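
A short sketch to convince yourself that if only looks at exit codes. The helper check is a made-up name: it runs whatever command you give it and reports which branch of the if was taken.

```bash
#!/bin/bash
# check is a hypothetical helper: run the command passed as arguments
# and report how "if" interprets its exit code
check() {
    if "$@"; then
        echo "success"
    else
        echo "failure"
    fi
}

check true                                        # true always returns 0
check false                                       # false always returns 1
check grep -q "duck" <<< "one duck, two ducks"    # grep returns 0 when it finds a match
check grep -q "goose" <<< "one duck, two ducks"   # and 1 when it does not
```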

With [ and [[ (and a few others) let’s see what is what for those conditions.

if [ condition ]

[ is POSIX, see test utility
it is a binary that you can find in /usr/bin/[ (or /bin/[) but it is also defined as a builtin command in bash

simply put [ is a command with a strange name, and ] is its required last argument, marking the end of the condition,
which is why [ arg ] is to be seen as the command [ with the params arg and ].

this is the more portable way to test for conditions because it will work in numerous shells and in particular sh (eg. #!/bin/sh).
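
You can ask bash itself what [ is. A minimal sketch, where the variable names are only chosen for this example:

```bash
#!/bin/bash
# type -t prints how bash resolves a name (builtin, keyword, file, ...)
kind_of_bracket=$(type -t [)
echo "[ is a ${kind_of_bracket}"

# "test" is the same command, without the closing ] argument
if test -d "/"; then
    with_test="directory"
fi

if [ -d "/" ]; then
    with_bracket="directory"
fi

echo "test says: ${with_test}, [ says: ${with_bracket}"
```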

if [[ condition ]]

[[ is a bash keyword (not a command) and a bash extension.

you cannot use it in a sh script

here is a list of what [[ can do (and [ can’t)

  • can do pattern matching
    eg. if [[ abc = ab? ]]
  • can test if a string matches a regular expression
    eg. if [[ $num =~ ^[0-9]{3}$ ]]
  • can do lexicographical comparison
    eg. if [[ abc < def ]]
  • can use && and ||
    eg. if [[ expr1 && expr2 ]]
    while with [ you will have to do this
    if [ expr1 ] && [ expr2 ]
  • can use ( ) to group expressions
    eg. if [[ (expr1 || expr2) && expr3 ]]
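
The list above can be tried in one go with a sketch like this one. The results array and the value of num are assumptions made for the example; each test that succeeds adds a label to the array.

```bash
#!/bin/bash
num="123"
results=()

# pattern matching: ? matches exactly one character
if [[ abc = ab? ]]; then results+=("pattern"); fi

# regular expression: exactly three digits
if [[ $num =~ ^[0-9]{3}$ ]]; then results+=("regex"); fi

# lexicographical comparison: abc sorts before def
if [[ abc < def ]]; then results+=("lexicographical"); fi

# && inside a single [[ ]]
if [[ -n "$num" && -d "/" ]]; then results+=("and"); fi

# grouping with ( )
if [[ ( -z "$num" || -d "/" ) && -n "$num" ]]; then results+=("grouping"); fi

printf '%s\n' "${results[@]}"
```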

if (( condition ))

Tests an arithmetic condition, again a bash extension.

It returns an exit code of 0 (ie. true) if the result of the arithmetic expression is non-zero.

Yep, it can be a bit confusing

  • so the arithmetic expression 1 returns a 0 exit code
    eg. if (( 1 )) means true
  • and the arithmetic expression 0 returns a non-zero exit code
    eg. if (( 0 )) means false
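
If the "non-zero result means exit code 0" part still feels backwards, a small sketch may help. The helper arith_truth is a made-up name: it evaluates the expression you give it inside (( )) and prints which branch was taken.

```bash
#!/bin/bash
# arith_truth is a hypothetical helper: evaluate an arithmetic
# expression inside (( )) and report how "if" sees it
arith_truth() {
    if (( $1 )); then
        echo "true"
    else
        echo "false"
    fi
}

arith_truth "1"        # non-zero result, so true
arith_truth "0"        # zero result, so false
arith_truth "5 - 5"    # evaluates to 0, so false
arith_truth "2 > 1"    # a comparison that holds evaluates to 1, so true
```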

if (command)

Executes a command in a subshell and the if statement acts according to its exit code.

So what is the use of a subshell?
see Grouping Commands

Placing a list of commands between parentheses causes a subshell environment to be created
(see Command Execution Environment),
and each of the commands in list to be executed in that subshell. Since the list is executed in a subshell,
variable assignments do not remain in effect after the subshell completes.

for example:

if cd mydir; then
    # we know that mydir exists and we entered it
    # create the file `myfile` in the `mydir` directory
    touch myfile
else
    echo "the directory 'mydir' does not exist"
fi

and

if (cd mydir); then
    # we know that mydir exists, but only the subshell entered it
    # create the file `myfile` in the current directory
    touch myfile
else
    echo "the directory 'mydir' does not exist"
fi

the difference is important
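
To check where the file really ends up in each case, here is a sketch that plays both variants inside a throwaway directory created with mktemp. The file names from_subshell_branch and from_plain_branch are made up so the two results can be told apart.

```bash
#!/bin/bash
# work in a temporary directory so nothing else gets touched
workdir=$(mktemp -d)
mkdir "${workdir}/mydir"
cd "$workdir" || exit 1

# subshell: the cd happens in the subshell only, we stay in $workdir
if (cd mydir); then
    touch from_subshell_branch
fi

# plain command: the cd changes our own working directory
if cd mydir; then
    touch from_plain_branch
fi

# list what was created, relative to $workdir
created=$(cd "$workdir" && find . -type f | sort)
echo "$created"

# clean up
cd / && rm -rf "$workdir"
```

The first file lands next to mydir, the second one inside it.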

if command

Executes a command and the if statement acts according to its exit code.


see Conditional Constructs and Bash Conditional Expressions and Other Comparison Operators.

With [ condition ] and [[ condition ]] you use comparison operators like

integer comparison

| operator | description                 |
|----------|-----------------------------|
| -eq      | is equal to                 |
| -ne      | is not equal to             |
| -gt      | is greater than             |
| -ge      | is greater than or equal to |
| -lt      | is less than                |
| -le      | is less than or equal to    |

With (( condition )) you use mathematical comparison operators like

integer comparison

| operator | description                 |
|----------|-----------------------------|
| ==       | is equal to                 |
| !=       | is not equal to             |
| >        | is greater than             |
| >=       | is greater than or equal to |
| <        | is less than                |
| <=       | is less than or equal to    |

be careful, inside (( )) a single = is an assignment, not a comparison

and, surprise surprise, with [[ condition ]] you can also use some of these mathematical-looking operators to compare strings

string comparison (in lexicographical order, using the current locale)

| operator | description                 |
|----------|-----------------------------|
| =        | is equal to                 |
| ==       | is equal to                 |
| !=       | is not equal to             |
| >        | is greater than             |
| <        | is less than                |

there is no >= or <= for strings in [[ ]]
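
A sketch showing why it matters which family of operators you pick: the same two values compare differently as integers and as strings. The variables a and b and the result names are assumptions made for this example.

```bash
#!/bin/bash
a=5
b=12

# [ ] with -lt compares integers: 5 is less than 12
if [ "$a" -lt "$b" ]; then int_result="true"; else int_result="false"; fi

# (( )) with < also compares integers
if (( a < b )); then arith_result="true"; else arith_result="false"; fi

# [[ ]] with < compares strings: "5" sorts after "12" because "5" comes after "1"
if [[ "$a" < "$b" ]]; then string_result="true"; else string_result="false"; fi

echo "[ 5 -lt 12 ]  -> ${int_result}"
echo "(( 5 < 12 ))  -> ${arith_result}"
echo "[[ 5 < 12 ]]  -> ${string_result}"
```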

Let’s analyse this part

if [[ -z "${TARGET}" ]]; then
    echo "project target is missing"
    echo "eg. $ ./${0##*/} projectname"
    exit 1
fi
  • if the length of the variable $TARGET is zero
  • then display “project target is missing”
  • then take the shell variable ${0} (the path of the current script, as invoked)
    and apply parameter substitution to get the basename
  • and so display a usage message
  • finally exit with the exit code 1

and the other part

if [ -d "$TARGET" ]; then
    echo "project directory already exists"
    exit 1
fi
  • if the filepath contained in variable $TARGET exists and is a directory
  • then display “project directory already exists”
  • finally exit with the exit code 1

In both cases we could have used either [ condition ] or [[ condition ]],
the inconsistency here is to use both instead of one or the other.

Something like ${0##*/} is pretty cryptic syntax, to know that it exists is imho enough,
no need to master it completely (and some answers are available a Google search away).
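
Still, for the curious, a small sketch of what ## does, applied to a made-up path instead of $0 so the result does not depend on how the script was started. The path and the variable names are assumptions for the example.

```bash
#!/bin/bash
# example path, invented for this illustration
path="/home/alice/bin/aa-project-create"

# ${var##pattern} removes the longest match of pattern from the start;
# "*/" matches everything up to the last slash, so only the file name is left
name="${path##*/}"

# ${var%pattern} removes the shortest match of pattern from the end;
# "/*" matches the last slash and what follows, so only the directory is left
dir="${path%/*}"

echo "name: ${name}"
echo "dir:  ${dir}"
```

In this example the two expansions give the same results as the basename and dirname commands, without starting an external process.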


I will end on these little things

  • Parameter Expansion / Substitution
    see Parameter Substitution
    eg. ${}
    if you declare a variable ANIMAL="duck"
    sure you can use echo $ANIMAL, and echo ${ANIMAL} feels overkill,
    but then there are cases where you want to reuse this var
    in the middle of other strings, for ex: echo "one $ANIMAL, two ${ANIMAL}s"
    using echo "two $ANIMALs" would not work, because bash would look for a variable named ANIMALs

  • Command substitution
    see Command substitution
    eg. $()
    this is directly related to the if (command)
    and yes there too the command is executed in a subshell
    and that’s pretty useful to assign the command output to a variable
    ex: FOOBAR=$(command arg1 arg2)

  • Arithmetic expansion
    see Arithmetic Expansion
    eg. $(())
    this is directly related to the if (( condition ))
    and allows you to assign the result of mathematical expressions to a variable
    ex: FOOBAR=$(( 12 + 3 ))
    now bash supports only integer calculation, not float
    to use floats you will have to rely on an external binary like bc (basic calculator)
    for ex: if (( $(echo "10.3 > 10.1" | bc -l) ))
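
The three expansions side by side, as a short sketch. ANIMAL comes from the example above, the other variable names are made up, and the bc part is left out so the sketch only relies on bash and tr.

```bash
#!/bin/bash
ANIMAL="duck"
unset ANIMALs

# parameter expansion: the braces tell bash where the variable name ends
with_braces="one $ANIMAL, two ${ANIMAL}s"
# without braces bash looks for a variable named ANIMALs, which is not set
without_braces="two $ANIMALs"

# command substitution: capture the output of a command (here a pipeline)
shouted=$(echo "$ANIMAL" | tr 'a-z' 'A-Z')

# arithmetic expansion: integers only, so 7 / 2 is truncated
total=$(( 12 + 3 ))
half=$(( 7 / 2 ))

echo "$with_braces"
echo "[$without_braces]"
echo "$shouted"
echo "total=${total} half=${half}"
```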

few more links