What happens when you type ls -l in a Linux shell? (Building a simple shell)
To start, you need to understand what a shell, a terminal, and a command line are.
The shell is a command-line interpreter: a program that takes commands from the keyboard and passes them to the operating system to perform. The most widely used shell is bash, which stands for “Bourne-again shell” and is used on Linux systems. On most Linux machines, bash is set as the default shell program and will be located in the
A terminal is basically an emulator that lets the user type and run programs from the keyboard; for many tasks this is faster than going through the operating system's graphical interface.
A command line is a text interface for the computer.
To understand all of this a little better, and how a shell works, we will look at what happens when we type ls -l and hit Enter, and see how to replicate that behaviour by building our own simple, basic shell.
So when we type ls -l in our terminal, we list the files in the current directory (hidden files are skipped by default) in the long listing format (the format is selected by the -l flag), as you can see in the next image:
But what is really happening behind the scenes?
1. The first thing you see when you run a shell is the prompt. The prompt is displayed to let the user know the shell is ready for commands, and it looks like this:
2. Now the shell waits for user input. The shell reads it with a function called getline(), which reads a line of text from standard input and stores the address of the buffer holding that line (here, “ls -l”) in the pointer we pass in.
$ ls -l
3. Now that we have read the line, we need to split the input into individual tokens. Those tokens should be stored in an array (don't forget to allocate memory for this array and free it after use). You can do this with the strtok() function.
4. Once our arguments are stored as tokens, they need to be analyzed before execution to see whether the command is an alias (a shortcut name for a command, file, or anything else in the shell), a built-in command (a command or function that executes directly in the shell itself; in this case we check for env, printenv and exit), or a program found through PATH (an environment variable that lists the directories searched for executables), in that specific order.
In the case of PATH, we need to get the value of the environment variable with the getenv() function, and we would get something like this:
We also need to tokenize this value to get each directory individually.
5. The next step is to execute the program. For this we use the system calls fork(), execve() and wait().
The concept of a process is fundamental to Unix/Linux operating systems: every running instance of a program is known as a process. Processes are distinguished by their ID, or identifier (PID), and each process has its own. The PID is a non-negative number associated with the process.
With fork() we create our new process (called the child process). A parent process uses fork() to create a new child process. The child process is a copy of the parent. After fork(), both parent and child execute the same program, but in separate processes.
After we have created the child process, we use execve() to execute the ls command, which replaces the program running in the child with “ls”, while the parent process waits until the child completes its execution. The exec system call replaces the program executed by a process: the child may call exec after a fork to replace its memory space with a new executable, so that the child runs a different program than the parent.
Once “ls” has finished, the shell goes back to the beginning, prints the prompt, and waits for more user input. This cycle continues until the user runs the exit built-in command or presses Ctrl-D.
Functions and system calls used
For more information, type man 3 getline in your terminal.
The arguments of the getline function are:
- The first argument is the address of a pointer to the buffer where the input line will be stored; getline() allocates or resizes that buffer as needed.
- The second argument is the address of the variable that holds the size of the input buffer, another pointer.
- The third argument is the input file handle, stdin in this case. You could use getline() to read a line of text from a file, but when stdin is specified, standard input is read.
- Return: the number of characters read on success (including the newline, if present), and -1 on failure or at end of file.
For more information, type man 3 strtok in your terminal.
The arguments of the strtok function are:
- The first argument is the string to be split on the first call, and NULL on later calls to keep splitting the same string.
- The second argument is a string containing the delimiter characters.
- Return: a pointer to the next token, or NULL if there are no more tokens.
For more information, type man 3 getenv in your terminal.
- The only argument is the name of the environment variable.
- Return: a pointer to the value in the environment, or NULL if there is no match.
For more information, type man 2 fork in your terminal.
- Return: on success, the PID of the child process is returned in the parent, and 0 is returned in the child. On failure, -1 is returned in the parent and no child process is created.
For more information, type man 2 execve in your terminal.
- Return: does not return on success; on failure, returns -1.
For more information, type man 2 wait in your terminal.
- Return: on success, returns the process ID of the terminated child; on error, -1 is returned.
I hope you find this information useful for understanding a little better what a shell does behind the scenes when we execute a simple command like ls -l.
Source code of a basic shell can be found here.