A deeper dive into the shell

A laptop displays code in a terminal. Next to the laptop, three plants sit before a window.
Photo by Safar Safarov on Unsplash

What happens when you type a command like ls -l into the shell?

To better understand the shell, let’s dig into the ls -l command to see each step of its process.

When you first enter the shell, it prints a prompt. When you type in ls -l and press enter, the shell uses the getline function to allocate memory and copy your input into a buffer. Before the shell can do anything with your input, however, the input needs some editing. Remember how you pressed enter to give your command to the shell? The getline function saved that newline character at the end of the string. To fix this, the shell first replaces the newline character with a null byte.
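As a rough illustration of this step, here is a minimal sketch in C. The helper names read_command and strip_newline are made up for this example and are not taken from any particular shell; the sketch only assumes a POSIX system where getline is available.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: replace a trailing '\n' (if any) with '\0'.
 * Returns the new length of the string. */
size_t strip_newline(char *line, size_t len)
{
    if (len > 0 && line[len - 1] == '\n') {
        line[len - 1] = '\0';
        len--;
    }
    return len;
}

/* Hypothetical helper: print a prompt, read one line from `in`,
 * and strip its newline. Returns a malloc'd string that the caller
 * must free, or NULL on end-of-file or error. */
char *read_command(FILE *in)
{
    char *line = NULL;   /* getline allocates the buffer for us */
    size_t cap = 0;
    ssize_t nread;

    printf("$ ");
    fflush(stdout);

    nread = getline(&line, &cap, in);
    if (nread == -1) {
        free(line);
        return NULL;
    }
    strip_newline(line, (size_t)nread);
    return line;
}
```

A shell loop could call read_command(stdin) once per prompt and free the returned string when it is done with the command.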

On top, a chart separates ls -l ‘\n’ into separate spaces. Below, the same chart is printed with ‘\0’ in place of ‘\n’

Once the array is up-to-date, the shell has to make sense of the input. First, it uses the strtok function to tokenize the string: the input is split on spaces into separate strings, and a pointer to each one is saved in an array of character pointers. (If you’re making your own shell, you can also use a linked list for this!)
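The tokenizing step might look something like the sketch below. The function name tokenize and the growth strategy for the array are assumptions made for this example; strtok itself only hands back one token at a time, so storing the pointers is up to the shell.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split `line` in place on the characters in
 * `delims`. Returns a NULL-terminated, malloc'd array of pointers
 * into `line`, or NULL if memory runs out. The caller frees the
 * array (the tokens themselves live inside `line`). */
char **tokenize(char *line, const char *delims)
{
    size_t cap = 8, count = 0;
    char **tokens = malloc(cap * sizeof(*tokens));
    char *tok;

    if (tokens == NULL)
        return NULL;

    for (tok = strtok(line, delims); tok != NULL; tok = strtok(NULL, delims)) {
        if (count + 1 >= cap) {          /* keep room for the final NULL */
            char **grown;

            cap *= 2;
            grown = realloc(tokens, cap * sizeof(*tokens));
            if (grown == NULL) {
                free(tokens);
                return NULL;
            }
            tokens = grown;
        }
        tokens[count++] = tok;
    }
    tokens[count] = NULL;                /* execve expects this terminator */
    return tokens;
}
```

Because strtok writes null bytes into the string it is given, the input buffer has to stay alive for as long as the token array is in use.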

Above, a chart shows “ls ‘\0’”. Below, a chart shows “-l ‘\0’”

Once the shell has separated the ls command from its arguments, it compares it to all of the shell’s built-ins and aliases. When it doesn’t find a match, it then switches to searching for a program with that name in the directories listed in your PATH.
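The built-in check can be as simple as a string comparison against a table. In this sketch, the function name is_builtin and the three-entry list are placeholders invented for illustration; a real shell has its own, much longer list, and alias lookup is left out entirely.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical built-in table; a real shell defines its own. */
static const char *const builtins[] = { "cd", "exit", "env", NULL };

/* Return 1 if `cmd` names one of the built-ins above, 0 otherwise. */
int is_builtin(const char *cmd)
{
    size_t i;

    for (i = 0; builtins[i] != NULL; i++) {
        if (strcmp(cmd, builtins[i]) == 0)
            return 1;
    }
    return 0;
}
```

With this table, is_builtin("ls") returns 0, which is the signal to go look through the PATH instead.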

Before it can do this, the shell must first capture your PATH. First, it searches your environment for the PATH variable. It then makes a copy of the PATH string, because strtok modifies the string it works on. Using the same strtok function as before, the shell splits the copy on colons to separate it into each directory in the PATH. Like before, this is saved into an array of strings (or a linked list, if you’d like).

When we add in print statements to this process, we can see how the shell tokenizes the PATH:

$ ls -l
Your path is [PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games]
Path token is [/usr/local/sbin]
Path token is [/usr/local/bin]
Path token is [/usr/sbin]
Path token is [/usr/bin]
Path token is [/sbin]
Path token is [/bin]
Path token is [/usr/games]
Path token is [/usr/local/games]
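A sketch of how the PATH lookup and tokenizing could be written is shown here. The names find_env and split_path are hypothetical, as is the choice of a fixed-size output array; the print statement mirrors the "Path token is" lines in the output above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: find the value of `name` in a NULL-terminated
 * array of "NAME=value" strings, the same layout as a program's
 * environment. Returns a pointer to the value, or NULL if not found. */
const char *find_env(char *const envp[], const char *name)
{
    size_t len = strlen(name);
    size_t i;

    for (i = 0; envp[i] != NULL; i++) {
        if (strncmp(envp[i], name, len) == 0 && envp[i][len] == '=')
            return envp[i] + len + 1;
    }
    return NULL;
}

/* Hypothetical helper: copy the PATH value and split the copy on ':'.
 * Stores up to `max` directory pointers in `dirs` and returns how many
 * were stored. The caller frees *copy_out when done with `dirs`. */
size_t split_path(const char *path, char **dirs, size_t max, char **copy_out)
{
    size_t count = 0;
    char *tok;
    char *copy = malloc(strlen(path) + 1);

    *copy_out = copy;
    if (copy == NULL)
        return 0;
    strcpy(copy, path);          /* strtok modifies its input, so work on a copy */

    for (tok = strtok(copy, ":"); tok != NULL && count < max;
         tok = strtok(NULL, ":")) {
        printf("Path token is [%s]\n", tok);
        dirs[count++] = tok;
    }
    return count;
}
```

One limitation worth knowing: strtok skips empty fields, so an empty PATH entry (which traditionally stands for the current directory) is silently dropped by this approach.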

Once the shell has a tokenized version of the PATH, it uses the opendir function to open each directory in the PATH, in order, and checks whether a file within it matches your command, ls. In most cases, it will find a match for the ls program within the /bin directory (or /usr/bin, depending on the system).
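The directory search could be sketched as follows. The function names dir_contains and find_in_dirs are invented for this example, and the sketch only compares names; it does not check whether the file is actually executable.

```c
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if directory `dir` has an entry named
 * `name`, else 0. A directory that cannot be opened counts as no match. */
int dir_contains(const char *dir, const char *name)
{
    DIR *dp = opendir(dir);
    struct dirent *entry;
    int found = 0;

    if (dp == NULL)
        return 0;
    while ((entry = readdir(dp)) != NULL) {
        if (strcmp(entry->d_name, name) == 0) {
            found = 1;
            break;
        }
    }
    closedir(dp);
    return found;
}

/* Hypothetical helper: return the first of `count` directories in
 * `dirs` that contains `name`, or NULL if none does. */
const char *find_in_dirs(char *const dirs[], size_t count, const char *name)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (dir_contains(dirs[i], name))
            return dirs[i];
    }
    return NULL;
}
```

Reading every entry with readdir follows the opendir approach described above. A common alternative is to build each candidate path and test it directly with stat or access, which avoids scanning large directories.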

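Now it’s almost time to execute! However, the command needs some editing first. Remember how the shell separated your input into unique strings earlier? Now that it knows which directory holds the file it needs, the shell needs the full command to be able to execute it. To do this, it will concatenate ls and -l. Keep in mind that execve itself takes the full path to the program plus an array of argument strings, so the matching directory and the command name also have to be joined into a path such as /bin/ls.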
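One piece of joining that execve depends on is the full path to the program, built from the directory where the match was found and the command name. The sketch below shows one way to do that; join_path is a hypothetical name, and the article does not say exactly how its shell assembles this string.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build "dir/cmd" in freshly allocated memory.
 * Returns NULL if memory runs out; the caller frees the result. */
char *join_path(const char *dir, const char *cmd)
{
    /* +1 for the '/' separator, +1 for the terminating null byte */
    size_t len = strlen(dir) + 1 + strlen(cmd) + 1;
    char *full = malloc(len);

    if (full == NULL)
        return NULL;
    snprintf(full, len, "%s/%s", dir, cmd);
    return full;
}
```

For example, join_path("/bin", "ls") produces the string /bin/ls, which can be passed to execve as the program to run.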

A chart shows “ls -l ‘\0’” separated by characters.

Once the command is ready, the shell will prepare to execute it. First, the shell forks, creating a child process. The parent process waits while the child process uses the execve function to run ls with the -l argument. On success, ls prints the long listing of the current directory: for each file, its permissions, the number of hard links, the owner, the group, the size, the time it was last modified, and the name.
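The fork, execve, and wait sequence can be sketched like this. The function name run_command and the exit code 127 for a failed execve are choices made for this example (127 is a common shell convention for "command not found"), not details taken from the article.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;   /* the current process's environment */

/* Hypothetical helper: fork, run `path` with `argv` in the child,
 * and wait for it in the parent. Returns the child's exit status,
 * or -1 if fork or waitpid fails or the child did not exit normally. */
int run_command(const char *path, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid == -1) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the program. */
        execve(path, argv, environ);
        perror(path);    /* only reached if execve fails */
        _exit(127);
    }
    /* Parent: block until the child finishes. */
    if (waitpid(pid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the command in this article, the call would look like run_command("/bin/ls", args), where args is the NULL-terminated array holding "ls" and "-l".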

A terminal displays the output of ls -l, which is the long version of a list of files in the current directory.

Once completed, the shell frees all used memory and prints its prompt once more.
