5.2 Fork
1. What is fork()?
In Unix-like operating systems (such as Linux), fork() is a system call that creates a new process by duplicating the calling process. The new process is called the child process, and it runs concurrently with the parent process that called fork(). The child is a copy of the parent, but it has its own process ID (PID) and its own separate memory space.
1.2 Key Points
- Creates a new process: When a process calls fork(), it creates a child process. The child is an almost exact copy of the parent, but the two processes have different PIDs.
- Two return values:
  - In the parent process: fork() returns the PID of the child process.
  - In the child process: fork() returns 0.
- Resource sharing: The child inherits copies of the parent’s file descriptors, environment variables, and memory. Changes to variables made in the child do not affect the parent, and vice versa. Inherited file descriptors, however, refer to the same open files, so parent and child share the file offset.
- Concurrent execution: Both the parent and the child continue executing after the fork(), but the order of execution is not guaranteed. The operating system’s scheduler decides which process runs first.
1.3 fork() in C
Here’s a basic example in C that demonstrates how to use fork():
    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork(); // Create a new process

        if (pid < 0) {
            // Error handling: fork failed
            perror("fork failed");
            return 1;
        }

        if (pid == 0) {
            // This block is executed by the child process
            printf("Child Process: PID = %d\n", (int)getpid());
        } else {
            // This block is executed by the parent process
            printf("Parent Process: PID = %d, Child PID = %d\n", (int)getpid(), (int)pid);
        }

        return 0;
    }
- fork() call: If fork() succeeds, it returns 0 in the child process and the child’s PID (a positive integer) in the parent process.
- In the child process: We print the child’s PID.
- In the parent process: We print the parent’s PID and the PID of the child.
- Error handling: If fork() fails (e.g., due to resource limits), it returns -1 in the parent and no child is created.
Output (the PIDs will differ on each run, and the two lines may appear in either order)

    Parent Process: PID = 1234, Child PID = 1235
    Child Process: PID = 1235
1.4 fork() and Process Tree
fork() can be used to create a tree of processes. Each time a process calls fork(), it creates a new child process, which can also call fork() to create its own child, and so on. The result is a process tree in which each node is a process. System-wide, the root is the init process; for a program you launch yourself, the root of its subtree is the process the shell started.
Here’s how the tree structure works:
- The initial process (often the shell or init process) calls fork() to create a child.
- The parent process can continue to fork() additional child processes, forming a branching structure.
- Each child process can call fork() to create its own child processes, forming subsequent branches.
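For example, if a process forks two children and each child then forks once, you end up with a structure like this:

            P              (Parent)
           / \
         C1   C2           (Children of P)
         |     |
         C3    C4          (Children of C1 and C2)

Where:
- P is the original parent process.
- C1 and C2 are children of P.
- C3 is a child of C1, and C4 is a child of C2.

Note that two unconditional fork() calls in a row produce a different shape: four processes in total (P, two children of P, and one grandchild), because the first child also executes the second fork().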
Example of fork() Creating a Tree:

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid1 = fork(); // First fork
        if (pid1 < 0) {
            perror("fork failed");
            return 1;
        }

        if (pid1 == 0) {
            // Child 1: Create its own child (Grandchild)
            pid_t pid2 = fork(); // Second fork
            if (pid2 < 0) {
                perror("fork failed");
                return 1;
            }
            if (pid2 == 0) {
                // Grandchild Process
                printf("Grandchild Process: PID = %d, Parent PID = %d\n", (int)getpid(), (int)getppid());
            } else {
                // Child 1 Process
                printf("Child 1 Process: PID = %d, Parent PID = %d\n", (int)getpid(), (int)getppid());
                wait(NULL); // Stay alive until the grandchild exits, then reap it
            }
        } else {
            // Parent Process
            printf("Parent Process: PID = %d\n", (int)getpid());
            wait(NULL); // Reap Child 1
        }

        return 0;
    }
Output (the PIDs will differ on each run, and the order of the lines may vary)

    Parent Process: PID = 1234
    Child 1 Process: PID = 1235, Parent PID = 1234
    Grandchild Process: PID = 1236, Parent PID = 1235
Process Tree Visualization
- Parent (PID 1234)
  - Child 1 (PID 1235)
    - Grandchild (PID 1236)
1.5 Can fork() Be Used as a Tree?
Yes, fork() naturally forms a tree of processes. Because each process can create more child processes by calling fork() again, you get a branching structure that resembles a tree. The parent-child relationship defines the hierarchy: the first process is the root, and every other process is either an internal node (if it creates children of its own) or a leaf (if it does not).
1.6 Important Considerations
- Process ID (PID): Each process in the tree has a unique PID. The root of the tree is typically the first process started (such as init or the shell), and each subsequent process (parent or child) has its own PID.
- Process Hierarchy: A process can find its parent with the getppid() function, which returns the parent’s PID, so the hierarchy can be followed upward one level at a time.
- Process Tree Management: A terminated child remains a zombie until its parent collects its exit status with wait() or waitpid() (this is called reaping). If a parent exits before its children, the operating system re-parents them (typically to init), which then reaps them.
2. C Program to Copy a Directory
Write a C program called cpdir (using only system calls) that copies all the files (including subdirectories) from one directory to another. Both source and destination must be passed as command-line arguments. If the destination directory exists, your program should prompt the user to overwrite or exit. Each file should be copied in its own process (forked). Your program should read the directory, and for each file, fork, open the file for reading, open a file with the same name in the destination directory for writing, and copy the contents. Don’t forget to handle each child process as it finishes! The name of each file and subdirectory should remain the same.
The following C program copies a directory (and its contents) from one location to another. It handles both regular files and subdirectories. The program uses system calls such as open(), read(), write(), and mkdir() to perform file operations, and it uses fork() to copy files in parallel.
2.1 Including Necessary Libraries
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/wait.h>
    #include <fcntl.h>
    #include <dirent.h>
    #include <string.h>
    #include <limits.h>
    #include <errno.h>

These header files are included for the following purposes:
- stdio.h: Provides input/output functions like printf(), sprintf(), and perror().
- stdlib.h: Includes functions for memory allocation and program control like exit().
- unistd.h: Declares system calls such as fork(), read(), write(), and close().
- sys/types.h: Defines data types for system calls, such as pid_t.
- sys/stat.h: Provides definitions for file status information (used with stat() and mkdir()).
- sys/wait.h: Declares wait() and the macros used to inspect a child’s exit status.
- fcntl.h: Declares open() and the flags passed to it, such as O_RDONLY and O_CREAT.
- dirent.h: Provides definitions for directory operations, such as opendir(), readdir(), and closedir().
- string.h: Provides functions for string manipulation, such as strcmp().
- limits.h: Defines PATH_MAX, used to size the path buffers.
- errno.h: Defines errno and error codes such as EEXIST.
2.2 Copying a Single File (copy_file)

    void copy_file(const char *src_file, const char *dest_file) {
        char buffer[BUFFER_SIZE];
        int src_fd, dest_fd;
        ssize_t bytes_read, bytes_written;

        // Open source file for reading
        src_fd = open(src_file, O_RDONLY);
        if (src_fd == -1) {
            perror("Error opening source file");
            exit(EXIT_FAILURE);
        }

        // Open destination file for writing
        dest_fd = open(dest_file, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
        if (dest_fd == -1) {
            perror("Error opening destination file");
            exit(EXIT_FAILURE);
        }

        // Copy contents from source file to destination file
        while ((bytes_read = read(src_fd, buffer, BUFFER_SIZE)) > 0) {
            bytes_written = write(dest_fd, buffer, bytes_read);
            if (bytes_written != bytes_read) {
                // A failed or short write is treated as fatal here
                perror("Error writing to destination file");
                exit(EXIT_FAILURE);
            }
        }

        if (bytes_read == -1) {
            perror("Error reading from source file");
            exit(EXIT_FAILURE);
        }

        // Close file descriptors
        close(src_fd);
        close(dest_fd);
    }
Explanation
- Open the Source and Destination Files:
  - The program opens the source file (src_file) in read-only mode and the destination file (dest_file) in write-only mode with the flags O_CREAT (create the file if it doesn’t exist) and O_TRUNC (truncate the file if it already exists).
- Reading and Writing:
  - It uses a buffer of size BUFFER_SIZE to read from the source file and write to the destination file. This is done in a loop until all data from the source file is transferred.
  - If there’s any error during reading or writing, it prints the error message and exits.
- Closing the Files: After the file copy is complete, both source and destination file descriptors are closed.
2.3 Copying a Directory (copy_directory)

    void copy_directory(const char *src_dir, const char *dest_dir) {
        DIR *dir;
        struct dirent *entry;

        // Open source directory
        dir = opendir(src_dir);
        if (dir == NULL) {
            perror("Error opening source directory");
            exit(EXIT_FAILURE);
        }

        // Create destination directory; an existing one is reused (errno needs <errno.h>)
        if (mkdir(dest_dir, S_IRWXU | S_IRWXG | S_IROTH | S_IXOTH) == -1 && errno != EEXIST) {
            perror("Error creating destination directory");
            exit(EXIT_FAILURE);
        }

        // Read source directory
        while ((entry = readdir(dir)) != NULL) {
            if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) {
                continue; // Skip current and parent directories
            }

            // Build full paths; snprintf() never writes past the buffer (PATH_MAX needs <limits.h>)
            char src_path[PATH_MAX];
            char dest_path[PATH_MAX];
            snprintf(src_path, sizeof(src_path), "%s/%s", src_dir, entry->d_name);
            snprintf(dest_path, sizeof(dest_path), "%s/%s", dest_dir, entry->d_name);

            if (entry->d_type == DT_DIR) {
                // Recursively copy subdirectories
                copy_directory(src_path, dest_path);
            } else if (entry->d_type == DT_REG) {
                // Copy regular files
                pid_t pid = fork();
                if (pid == -1) {
                    perror("Error forking process");
                    exit(EXIT_FAILURE);
                } else if (pid == 0) {
                    // Child process
                    copy_file(src_path, dest_path);
                    exit(EXIT_SUCCESS);
                }
            }
        }

        closedir(dir);
    }
Explanation
- Open the Source Directory: The program opens the source directory using opendir().
- Create the Destination Directory: It creates the destination directory using mkdir().
- Iterate Through the Directory:
  - The program reads each entry in the source directory with readdir().
  - It skips the entries for "." and "..", which represent the current and parent directories.
- Handling Subdirectories and Files:
  - If an entry is a subdirectory (d_type == DT_DIR), it recursively calls copy_directory() to copy the contents of that subdirectory.
  - If an entry is a regular file (d_type == DT_REG), the program forks a child process to handle the file copy in parallel.
  - The child process uses the copy_file() function to copy the file and then exits.
  - Entries of any other type (symbolic links, devices, and so on) are skipped. Note that d_type is not filled in by every filesystem; such entries report DT_UNKNOWN and are skipped as well.
- Close the Directory: After processing all entries, the program closes the directory with closedir().
2.4 Main Function
    int main(int argc, char *argv[]) {
        if (argc != 3) {
            fprintf(stderr, "Usage: %s <source_directory> <destination_directory>\n", argv[0]);
            exit(EXIT_FAILURE);
        }

        char *src_dir = argv[1];
        char *dest_dir = argv[2];

        // Check if destination directory already exists
        struct stat st;
        if (stat(dest_dir, &st) == 0 && S_ISDIR(st.st_mode)) {
            // Destination directory exists
            char choice;
            printf("Destination directory already exists. Do you want to overwrite it? (y/n): ");
            fflush(stdout); // Flush the prompt so forked children do not inherit it in their buffers
            if (scanf(" %c", &choice) != 1 || (choice != 'y' && choice != 'Y')) {
                printf("Exiting without overwriting.\n");
                exit(EXIT_SUCCESS);
            }
        }

        // Copy source directory to destination directory
        copy_directory(src_dir, dest_dir);

        // Wait for all child processes to finish (wait() is declared in <sys/wait.h>)
        int status;
        int failed = 0;
        while (wait(&status) > 0) {
            if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
                failed = 1; // At least one file copy did not succeed
            }
        }
        if (failed) {
            fprintf(stderr, "One or more files could not be copied.\n");
            return EXIT_FAILURE;
        }

        printf("Directory copied successfully.\n");

        return 0;
    }
Explanation
- Argument Checking: The program expects two arguments: a source directory and a destination directory. If the number of arguments is incorrect, it prints a usage message and exits.
- Check if Destination Exists:
  - The program checks if the destination directory already exists using stat(). If it exists, the program asks the user whether they want to overwrite it.
- Call copy_directory(): The program initiates the copying process by calling copy_directory().
- Wait for Child Processes: After the copying process begins, the program waits for all child processes (file copy tasks) to finish by calling wait().
- Completion Message: Once all child processes have finished, it prints “Directory copied successfully.”
2.5 Key Concepts
- Forking Processes: The program uses fork() to create child processes that copy files concurrently. This can improve throughput when many files are copied, although the gain depends on the storage device and on the cost of creating one process per file.
- Directory Traversal: The program uses opendir() and readdir() to traverse directories, handling subdirectories and files.
- Recursion: The function copy_directory() is recursive, allowing it to handle nested subdirectories.
- File Operations: The program uses open(), read(), write(), and close() to perform low-level file I/O operations.
2.6 Complete C Code
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/wait.h>
    #include <fcntl.h>
    #include <dirent.h>
    #include <string.h>
    #include <limits.h>
    #include <errno.h>

    #define BUFFER_SIZE 4096

    void copy_file(const char *src_file, const char *dest_file) {
        char buffer[BUFFER_SIZE];
        int src_fd, dest_fd;
        ssize_t bytes_read, bytes_written;

        // Open source file for reading
        src_fd = open(src_file, O_RDONLY);
        if (src_fd == -1) {
            perror("Error opening source file");
            exit(EXIT_FAILURE);
        }

        // Open destination file for writing
        dest_fd = open(dest_file, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
        if (dest_fd == -1) {
            perror("Error opening destination file");
            exit(EXIT_FAILURE);
        }

        // Copy contents from source file to destination file
        while ((bytes_read = read(src_fd, buffer, BUFFER_SIZE)) > 0) {
            bytes_written = write(dest_fd, buffer, bytes_read);
            if (bytes_written != bytes_read) {
                // A failed or short write is treated as fatal here
                perror("Error writing to destination file");
                exit(EXIT_FAILURE);
            }
        }

        if (bytes_read == -1) {
            perror("Error reading from source file");
            exit(EXIT_FAILURE);
        }

        // Close file descriptors
        close(src_fd);
        close(dest_fd);
    }

    void copy_directory(const char *src_dir, const char *dest_dir) {
        DIR *dir;
        struct dirent *entry;

        // Open source directory
        dir = opendir(src_dir);
        if (dir == NULL) {
            perror("Error opening source directory");
            exit(EXIT_FAILURE);
        }

        // Create destination directory; an existing one is reused
        if (mkdir(dest_dir, S_IRWXU | S_IRWXG | S_IROTH | S_IXOTH) == -1 && errno != EEXIST) {
            perror("Error creating destination directory");
            exit(EXIT_FAILURE);
        }

        // Read source directory
        while ((entry = readdir(dir)) != NULL) {
            if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) {
                continue; // Skip current and parent directories
            }

            // Build full paths; snprintf() never writes past the buffer
            char src_path[PATH_MAX];
            char dest_path[PATH_MAX];
            snprintf(src_path, sizeof(src_path), "%s/%s", src_dir, entry->d_name);
            snprintf(dest_path, sizeof(dest_path), "%s/%s", dest_dir, entry->d_name);

            if (entry->d_type == DT_DIR) {
                // Recursively copy subdirectories
                copy_directory(src_path, dest_path);
            } else if (entry->d_type == DT_REG) {
                // Copy regular files
                pid_t pid = fork();
                if (pid == -1) {
                    perror("Error forking process");
                    exit(EXIT_FAILURE);
                } else if (pid == 0) {
                    // Child process
                    copy_file(src_path, dest_path);
                    exit(EXIT_SUCCESS);
                }
            }
        }

        closedir(dir);
    }

    int main(int argc, char *argv[]) {
        if (argc != 3) {
            fprintf(stderr, "Usage: %s <source_directory> <destination_directory>\n", argv[0]);
            exit(EXIT_FAILURE);
        }

        char *src_dir = argv[1];
        char *dest_dir = argv[2];

        // Check if destination directory already exists
        struct stat st;
        if (stat(dest_dir, &st) == 0 && S_ISDIR(st.st_mode)) {
            // Destination directory exists
            char choice;
            printf("Destination directory already exists. Do you want to overwrite it? (y/n): ");
            fflush(stdout); // Flush the prompt so forked children do not inherit it in their buffers
            if (scanf(" %c", &choice) != 1 || (choice != 'y' && choice != 'Y')) {
                printf("Exiting without overwriting.\n");
                exit(EXIT_SUCCESS);
            }
        }

        // Copy source directory to destination directory
        copy_directory(src_dir, dest_dir);

        // Wait for all child processes to finish
        int status;
        int failed = 0;
        while (wait(&status) > 0) {
            if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
                failed = 1; // At least one file copy did not succeed
            }
        }
        if (failed) {
            fprintf(stderr, "One or more files could not be copied.\n");
            return EXIT_FAILURE;
        }

        printf("Directory copied successfully.\n");

        return 0;
    }