"Creating Your Own Command Interpreter: Main Concepts Guide: Simple Shell - Part 2"
Deeper into UNIX: Exploring the Inner Workings of UNIX with a Custom Interpreter
Welcome to Part 2 of our blog series on shells and processes! In the first part, we explored some of the fundamental concepts behind shells and processes, including how shells work, what PIDs and PPIDs are, how to manipulate the environment of the current process, and the difference between functions and system calls.
In this part, we'll look at how processes are created and how they interact with the operating system.
By the end of this part, you'll have a solid understanding of how to create and manage processes in a shell environment, and how to interact with other programs using input/output redirection. So let's dive in and continue our exploration of the fascinating world of shells and processes!
How to create processes?
Why? Creating new processes is an essential part of modern operating systems, and it is necessary for a wide range of tasks. The most visible one is running multiple programs concurrently: by creating a separate process for each program, the operating system can run several programs at the same time, even on a single-core processor, by switching rapidly between them. This allows users to work on multiple tasks without having to wait for one program to finish before starting another.
In C, you can create new processes using the fork system call, which creates a new process by duplicating the existing process. Here's an example of how to create a new process in a simple shell.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main() {
    pid_t pid;

    // Fork a new process
    pid = fork();
    if (pid < 0) {
        // Error
        fprintf(stderr, "Fork failed\n");
        exit(1);
    } else if (pid == 0) {
        // Child process
        printf("This is the child process\n");
        exit(0);
    } else {
        // Parent process
        printf("This is the parent process\n");
        exit(0);
    }
    return 0;
}
In this example, the fork() system call is used to create a new process. fork() returns the process ID of the new child process to the parent process, and returns 0 to the child process. If no child could be created, it returns -1 to the parent. In this case, the parent process prints "This is the parent process", and the child process prints "This is the child process". The two lines can appear in either order, because both processes run concurrently after the fork.
Note that pid_t is the data type used to represent process IDs (PIDs). It is a signed integer type and is defined in the sys/types.h header file.
FUNCTIONS TO KNOW :
getpid() is a system call, wrapped by the C library, that returns the process ID (PID) of the calling process. It is declared in the unistd.h header file and takes no arguments. The PID returned by getpid() is a unique identifier assigned by the operating system to each process running on the system.
Here's an example of how to use getpid():
wait() is a system call that suspends the calling process until one of its child processes terminates. It is declared in the sys/wait.h header file and takes a pointer to an integer variable that will be set to the status of the terminated child process. The child's exit code is not stored there directly: it is extracted from the status with the WIFEXITED and WEXITSTATUS macros.
What are the three prototypes of main?
In the C programming language, the main() function is the entry point of a program. It is the first function that is executed when a program starts running. There are three commonly used prototypes for the main() function in C, which are:
int main(): This is the most common prototype for the main() function in C. It takes no arguments and returns an integer value to the operating system that indicates the success or failure of the program.
int main(int argc, char *argv[]): This prototype takes two arguments. The first argument, argc, is an integer that represents the number of command-line arguments passed to the program, including the program name itself. The second argument, argv, is an array of strings that contains the command-line arguments themselves.
int main(int argc, char *argv[], char *envp[]): This prototype takes three arguments. The first two arguments are the same as in the previous prototype, but the third argument, envp, is a NULL-terminated array of strings that contains the program's environment variables. This form is not part of the C standard, but it is widely supported on Unix-like systems.
Here's an example of how to use the second prototype of the main() function:
#include <stdio.h>

int main(int argc, char *argv[]) {
    printf("Number of arguments: %d\n", argc);
    for (int i = 0; i < argc; i++) {
        printf("Argument %d: %s\n", i, argv[i]);
    }
    return 0;
}
Here's an example of how to use the third prototype of the main() function:
#include <stdio.h>

int main(int argc, char *argv[], char *envp[]) {
    for (int i = 0; envp[i] != NULL; i++) {
        printf("Environment variable %d: %s\n", i, envp[i]);
    }
    return 0;
}
In this example, the main() function takes three arguments: argc and argv, as in the previous example, and envp, which is an array of strings that contains the program's environment variables. The program prints out each environment variable, one by one.
Note that gcc accepts the three-argument form of main() without any special flags; defining the _GNU_SOURCE macro is not required for envp. If you encounter an undefined reference to `main' error, it comes from the linker and means that no main() function was found at all, so check that the file you are compiling actually defines one. To compile the example, use:
gcc example.c -o example
argv versus envp:
Command-line arguments (argv), for example an input file: a program that reads data from a file can take the name of the file as a command-line argument. For example, myprogram input.txt would run the program myprogram and pass the name of the input file input.txt as an argument.
Environment variables that can be read through envp in C, with some examples:
PATH: The PATH environment variable specifies which directories the system should search for executable files. For example, PATH=/usr/local/bin:/usr/bin:/bin would tell the system to search for executable files in the directories /usr/local/bin, /usr/bin, and /bin.
HOME: The HOME environment variable specifies the home directory of the current user. For example, HOME=/home/username would specify that the home directory of the current user is /home/username.
How does the shell use the PATH to find the programs?
The PATH environment variable is a string that contains a list of directories separated by a delimiter (usually a colon : on Unix-like systems and a semicolon ; on Windows). The PATH variable is used by the shell or the operating system to search for executable files when a command is entered.
This process allows the user to run a command from anywhere on the system, without having to specify the full path to the executable file. It also allows the user to customize the system by adding directories to the PATH variable, which can be useful for installing custom software or modifying the behaviour of existing programs.
Here's how the shell uses the PATH environment variable to find a program:
1. The user types a command into the shell, such as ls.
2. The shell looks up the value of the PATH environment variable.
3. The shell splits the PATH variable into individual directory names based on the : delimiter. For example, if PATH is set to /usr/local/bin:/usr/bin:/bin, the shell will split it into the following directories: /usr/local/bin, /usr/bin, and /bin.
4. The shell looks for the executable file in each directory in the PATH variable, in the order listed, and stops at the first match. For example, the shell would look for ls at the following paths: /usr/local/bin/ls, /usr/bin/ls, and /bin/ls.
5. If the shell finds the executable file, it executes it. If the shell does not find the executable file in any of the directories listed in PATH, it displays an error message.
For example, suppose you have a program called myprogram installed in the directory /usr/local/bin. If the PATH variable is set to /usr/local/bin:/usr/bin:/bin, and you enter the command myprogram in a shell, the shell will search for an executable file called myprogram in the directories /usr/local/bin, /usr/bin, and /bin, in that order. If it finds an executable file called myprogram in the directory /usr/local/bin, it will execute that file.
You can modify the PATH variable to include additional directories where executable files may be located. For example, if you install a program in the directory /opt/myprogram, you can add that directory to the PATH variable by appending the following line to your shell configuration file (such as .bashrc or .zshrc):
export PATH=$PATH:/opt/myprogram
This adds the directory /opt/myprogram to the end of the PATH variable, so that the shell will search for executable files in that directory as well.
In this part, we have explored the basics of creating processes in the C programming language. We have discussed the three prototypes of the main function, which is the entry point of any C program. Additionally, we have looked at how the shell uses the PATH environment variable to find programs when commands are entered.
In the next part of this blog series, we will delve deeper into the topic of process creation and management by looking at how to execute another program using the execve system call. We will also explore how to suspend the execution of a process until one of its children terminates.
Go to Part 1 : PART 1 Simple Shell
Go to Part 3 : PART 3 Simple Shell