
5.2 Fork

1. What is fork()?

In Unix-like operating systems (such as Linux), fork() is a system call used to create a new process by duplicating the calling process. The new process is called the child process, and it runs concurrently with the parent process that called fork(). The child process is a copy of the parent, but with its own process ID (PID) and separate memory space.

1.2 Key Points

  • Creates a new process: When a process calls fork(), it creates a child process. The child process is an almost exact copy of the parent, but the two processes have different PIDs.
  • Two return values:
    • In the parent process: fork() returns the PID of the child process.
    • In the child process: fork() returns 0.
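  • Resource sharing: The child receives its own copy of the parent’s memory, environment variables, and file descriptor table. Changes to variables in one process are not visible in the other. Open files are the exception: each duplicated descriptor refers to the same open file description as the parent’s, so the file offset and status flags are shared between the two processes.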
  • Concurrent execution: Both the parent and child process continue executing after the fork(), but the order of execution is not guaranteed. The operating system’s scheduler decides which process runs first.

1.3 fork() in C

Here’s a basic example in C that demonstrates how to use fork():

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

int main(void) {
    pid_t pid = fork(); // Create a new process
    if (pid < 0) {
        // Error handling: fork failed
        perror("fork failed");
        return 1;
    }
    if (pid == 0) {
        // This block is executed by the child process
        printf("Child Process: PID = %d\n", getpid());
    } else {
        // This block is executed by the parent process
        printf("Parent Process: PID = %d, Child PID = %d\n", getpid(), pid);
    }
    return 0;
}
  1. fork() call:
    • If the fork() succeeds, it returns 0 in the child process, and it returns the child’s PID (a positive integer) in the parent process.
  2. In the child process: We print the child’s PID.
  3. In the parent process: We print the parent’s PID and the PID of the child.
  4. Error handling: If fork() fails (e.g., due to resource limits), it returns -1 in the calling process, sets errno, and no child is created.
Output (PIDs will differ, and the two lines may appear in either order):
Parent Process: PID = 1234, Child PID = 1235
Child Process: PID = 1235

1.4 fork() and Process Tree

fork() can be used to create a tree of processes. Each time a process calls fork(), it creates a new child process, which can also call fork() to create its own child, and so on. This results in a process tree whose root is the first process (the system’s init process for the whole machine, or the process started from the shell for a single program), and each node in the tree represents a process.

Here’s how the tree structure works:

  1. The initial process (often the shell or init process) calls fork() to create a child.
  2. The parent process can continue to fork() additional child processes, forming a branching structure.
  3. Each child process can call fork() to create its own child processes, forming subsequent branches.

For example, if you run a program that calls fork() twice in a row, both the original process and its first child execute the second fork(), so you end up with four processes:

      P
     / \
   C1   C2
   |
   C3

Where:

  • P is the original parent process.
  • C1 is created by P’s first fork(), and C2 by P’s second fork().
  • C3 is created when C1 executes the second fork().

Example of fork() Creating a Tree:

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void) {
    pid_t pid1 = fork(); // First fork
    if (pid1 < 0) {
        perror("fork failed");
        return 1;
    }
    if (pid1 == 0) {
        // Child 1: Create its own child (Grandchild)
        pid_t pid2 = fork(); // Second fork
        if (pid2 < 0) {
            perror("fork failed");
            return 1;
        }
        if (pid2 == 0) {
            // Grandchild Process
            printf("Grandchild Process: PID = %d, Parent PID = %d\n", getpid(), getppid());
        } else {
            // Child 1 Process
            printf("Child 1 Process: PID = %d, Parent PID = %d\n", getpid(), getppid());
            wait(NULL); // Stay alive until the grandchild finishes, so its getppid() is reliable
        }
    } else {
        // Parent Process
        printf("Parent Process: PID = %d\n", getpid());
        wait(NULL); // Reap Child 1
    }
    return 0;
}
Output (PIDs will differ, and the order of the lines may vary):
Parent Process: PID = 1234
Child 1 Process: PID = 1235, Parent PID = 1234
Grandchild Process: PID = 1236, Parent PID = 1235
Process Tree Visualization
  • Parent (PID 1234)
    • Child 1 (PID 1235)
      • Grandchild (PID 1236)

1.5 Can fork() Be Used as a Tree?

Yes, fork() naturally forms a tree of processes. As each process can create more child processes by calling fork() again, you get a branching structure that resembles a tree. The parent-child relationship forms a tree hierarchy, with each parent process acting as the root or an internal node and each child process being a leaf node or an internal node depending on whether it creates its own children.

1.6 Important Considerations

  • Process ID (PID): Each process in the tree has a unique PID. The root of the tree is typically the first process started (such as init or the shell), and each subsequent process (parent/child) has its own PID.
  • Process Hierarchy: The relationship between processes can be traversed using the getppid() function, which returns the parent PID.
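  • Process Tree Management: When a child terminates, the kernel keeps a small record of it (a zombie) until the parent collects its exit status with wait() or waitpid(); this is called reaping. If the parent exits first, the orphaned child is adopted by init (PID 1), or by a designated subreaper on Linux, which then reaps it.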

2. C Program to Copy a Directory

Write a C program called cpdir (using only system calls) that copies all the files (including subdirectories) from one directory to another. Both source and destination must be passed as command-line arguments. If the destination directory exists, your program should prompt the user to overwrite or exit. Each file should be copied in its own forked process. Your program should read the directory and, for each file, fork, open the file for reading, open a file with the same name in the destination directory for writing, and copy the contents. Don’t forget to reap each child process as it finishes! The name of each file and subdirectory should remain the same.

The following C program is designed to copy a directory (and its contents) from one location to another. It handles both regular files and subdirectories. The program makes use of system calls like open(), read(), write(), and mkdir() to perform file operations, and it uses forking (fork()) to copy files in parallel.

2.1 Including Necessary Libraries

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <limits.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <dirent.h>
#include <string.h>

These header files are included for the following purposes:

  • stdio.h: Provides input/output functions like printf(), snprintf(), and perror().
  • stdlib.h: Includes functions for memory allocation and program control like exit().
  • unistd.h: Declares system call wrappers such as fork(), read(), write(), and close().
  • errno.h: Provides errno and error codes such as EEXIST.
  • limits.h: Provides implementation limits such as PATH_MAX.
  • sys/types.h: Defines data types for system calls, such as pid_t.
  • sys/stat.h: Provides definitions for file status information (used with stat() and mkdir()).
  • sys/wait.h: Declares wait() and the macros for inspecting a child’s exit status.
  • fcntl.h: Declares open() and the flags used with it, such as O_RDONLY and O_CREAT.
  • dirent.h: Provides definitions for directory operations, such as opendir(), readdir(), and closedir().
  • string.h: Provides functions for string manipulation, such as strcmp().

2.2 Copying a Single File (copy_file)

#define BUFFER_SIZE 4096 // Defined once near the top of the file

void copy_file(const char *src_file, const char *dest_file) {
    char buffer[BUFFER_SIZE];
    int src_fd, dest_fd;
    ssize_t bytes_read, bytes_written;

    // Open source file for reading
    src_fd = open(src_file, O_RDONLY);
    if (src_fd == -1) {
        perror("Error opening source file");
        exit(EXIT_FAILURE);
    }

    // Open destination file for writing
    dest_fd = open(dest_file, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
    if (dest_fd == -1) {
        perror("Error opening destination file");
        exit(EXIT_FAILURE);
    }

    // Copy contents from source file to destination file
    while ((bytes_read = read(src_fd, buffer, BUFFER_SIZE)) > 0) {
        bytes_written = write(dest_fd, buffer, bytes_read);
        if (bytes_written != bytes_read) {
            perror("Error writing to destination file");
            exit(EXIT_FAILURE);
        }
    }
    if (bytes_read == -1) {
        perror("Error reading from source file");
        exit(EXIT_FAILURE);
    }

    // Close file descriptors
    close(src_fd);
    close(dest_fd);
}
Explanation
  • Open the Source and Destination Files:
    • The program opens the source file (src_file) in read-only mode and the destination file (dest_file) in write-only mode with the flags O_CREAT (create the file if it doesn’t exist) and O_TRUNC (truncate the file if it already exists).
  • Reading and Writing:
    • It uses a buffer of size BUFFER_SIZE to read from the source file and write to the destination file. This is done in a loop until all data from the source file is transferred.
    • If there’s any error during reading or writing, it prints the error message and exits.
  • Closing the Files: After the file copy is complete, both source and destination file descriptors are closed.

2.3 Copying a Directory (copy_directory)

void copy_directory(const char *src_dir, const char *dest_dir) {
    DIR *dir;
    struct dirent *entry;

    // Open source directory
    dir = opendir(src_dir);
    if (dir == NULL) {
        perror("Error opening source directory");
        exit(EXIT_FAILURE);
    }

    // Create destination directory; an existing one is reused
    // (errno and EEXIST come from <errno.h>)
    if (mkdir(dest_dir, S_IRWXU | S_IRWXG | S_IROTH | S_IXOTH) == -1 && errno != EEXIST) {
        perror("Error creating destination directory");
        exit(EXIT_FAILURE);
    }

    // Read source directory
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) {
            continue; // Skip current and parent directories
        }

        // PATH_MAX comes from <limits.h>; snprintf() never writes past the buffer
        char src_path[PATH_MAX];
        char dest_path[PATH_MAX];
        snprintf(src_path, sizeof(src_path), "%s/%s", src_dir, entry->d_name);
        snprintf(dest_path, sizeof(dest_path), "%s/%s", dest_dir, entry->d_name);

        if (entry->d_type == DT_DIR) {
            // Recursively copy subdirectories
            copy_directory(src_path, dest_path);
        } else if (entry->d_type == DT_REG) {
            // Copy regular files
            pid_t pid = fork();
            if (pid == -1) {
                perror("Error forking process");
                exit(EXIT_FAILURE);
            } else if (pid == 0) {
                // Child process
                copy_file(src_path, dest_path);
                exit(EXIT_SUCCESS);
            }
        }
    }
    closedir(dir);
}
Explanation
  • Open the Source Directory: The program opens the source directory using opendir().
  • Create the Destination Directory: It creates the destination directory using mkdir().
  • Iterate Through the Directory:
    • The program reads each entry in the source directory with readdir().
    • It skips the entries for "." and "..", which represent the current and parent directories.
  • Handling Subdirectories and Files:
    • If an entry is a subdirectory (d_type == DT_DIR), it recursively calls copy_directory() to copy the contents of that subdirectory.
    • If an entry is a regular file (d_type == DT_REG), the program forks a child process to handle the file copy in parallel.
    • The child process uses the copy_file() function to copy the file and then exits.
  • Close the Directory: After processing all entries, the program closes the directory with closedir().

2.4 Main Function

int main(int argc, char *argv[]) {
    if (argc != 3) {
        fprintf(stderr, "Usage: %s <source_directory> <destination_directory>\n", argv[0]);
        exit(EXIT_FAILURE);
    }
    char *src_dir = argv[1];
    char *dest_dir = argv[2];

    // Check if destination directory already exists
    struct stat st;
    if (stat(dest_dir, &st) == 0 && S_ISDIR(st.st_mode)) {
        // Destination directory exists
        char choice;
        printf("Destination directory already exists. Do you want to overwrite it? (y/n): ");
        fflush(stdout); // Show the prompt now so forked children do not inherit it in their buffers
        if (scanf(" %c", &choice) != 1 || (choice != 'y' && choice != 'Y')) {
            printf("Exiting without overwriting.\n");
            exit(EXIT_SUCCESS);
        }
    }

    // Copy source directory to destination directory
    copy_directory(src_dir, dest_dir);

    // Wait for all child processes to finish (wait() is declared in <sys/wait.h>)
    int status;
    while (wait(&status) > 0)
        ;
    printf("Directory copied successfully.\n");
    return 0;
}
Explanation
  • Argument Checking: The program expects two arguments: a source directory and a destination directory. If the number of arguments is incorrect, it prints a usage message and exits.
  • Check if Destination Exists:
    • The program checks if the destination directory already exists using stat(). If it exists, the program asks the user whether they want to overwrite it.
  • Call copy_directory(): The program initiates the copying process by calling copy_directory().
  • Wait for Child Processes: After the copying process begins, the program waits for all child processes (file copy tasks) to finish by calling wait().
  • Completion Message: Once all files and directories are copied, it prints “Directory copied successfully.”

2.5 Key Concepts

  1. Forking Processes: The program uses fork() to create one child process per regular file, so several files can be copied concurrently. Whether this is faster than a sequential copy depends on the storage device and the number of files, since every fork() has its own overhead.
  2. Directory Traversal: The program uses opendir() and readdir() to traverse directories, handling subdirectories and files.
  3. Recursion: The function copy_directory() is recursive, allowing it to handle nested subdirectories.
  4. File Operations: The program uses open(), read(), write(), and close() to perform low-level file I/O operations.

2.6 Complete C Code

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <limits.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <dirent.h>
#include <string.h>

#define BUFFER_SIZE 4096

void copy_file(const char *src_file, const char *dest_file) {
    char buffer[BUFFER_SIZE];
    int src_fd, dest_fd;
    ssize_t bytes_read, bytes_written;

    // Open source file for reading
    src_fd = open(src_file, O_RDONLY);
    if (src_fd == -1) {
        perror("Error opening source file");
        exit(EXIT_FAILURE);
    }

    // Open destination file for writing
    dest_fd = open(dest_file, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
    if (dest_fd == -1) {
        perror("Error opening destination file");
        exit(EXIT_FAILURE);
    }

    // Copy contents from source file to destination file
    while ((bytes_read = read(src_fd, buffer, BUFFER_SIZE)) > 0) {
        bytes_written = write(dest_fd, buffer, bytes_read);
        if (bytes_written != bytes_read) {
            perror("Error writing to destination file");
            exit(EXIT_FAILURE);
        }
    }
    if (bytes_read == -1) {
        perror("Error reading from source file");
        exit(EXIT_FAILURE);
    }

    // Close file descriptors
    close(src_fd);
    close(dest_fd);
}

void copy_directory(const char *src_dir, const char *dest_dir) {
    DIR *dir;
    struct dirent *entry;

    // Open source directory
    dir = opendir(src_dir);
    if (dir == NULL) {
        perror("Error opening source directory");
        exit(EXIT_FAILURE);
    }

    // Create destination directory; an existing one is reused
    if (mkdir(dest_dir, S_IRWXU | S_IRWXG | S_IROTH | S_IXOTH) == -1 && errno != EEXIST) {
        perror("Error creating destination directory");
        exit(EXIT_FAILURE);
    }

    // Read source directory
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) {
            continue; // Skip current and parent directories
        }

        // snprintf() never writes past the end of the buffer
        char src_path[PATH_MAX];
        char dest_path[PATH_MAX];
        snprintf(src_path, sizeof(src_path), "%s/%s", src_dir, entry->d_name);
        snprintf(dest_path, sizeof(dest_path), "%s/%s", dest_dir, entry->d_name);

        if (entry->d_type == DT_DIR) {
            // Recursively copy subdirectories
            copy_directory(src_path, dest_path);
        } else if (entry->d_type == DT_REG) {
            // Copy regular files
            pid_t pid = fork();
            if (pid == -1) {
                perror("Error forking process");
                exit(EXIT_FAILURE);
            } else if (pid == 0) {
                // Child process
                copy_file(src_path, dest_path);
                exit(EXIT_SUCCESS);
            }
        }
    }
    closedir(dir);
}

int main(int argc, char *argv[]) {
    if (argc != 3) {
        fprintf(stderr, "Usage: %s <source_directory> <destination_directory>\n", argv[0]);
        exit(EXIT_FAILURE);
    }
    char *src_dir = argv[1];
    char *dest_dir = argv[2];

    // Check if destination directory already exists
    struct stat st;
    if (stat(dest_dir, &st) == 0 && S_ISDIR(st.st_mode)) {
        // Destination directory exists
        char choice;
        printf("Destination directory already exists. Do you want to overwrite it? (y/n): ");
        fflush(stdout); // Show the prompt now so forked children do not inherit it in their buffers
        if (scanf(" %c", &choice) != 1 || (choice != 'y' && choice != 'Y')) {
            printf("Exiting without overwriting.\n");
            exit(EXIT_SUCCESS);
        }
    }

    // Copy source directory to destination directory
    copy_directory(src_dir, dest_dir);

    // Wait for all child processes to finish
    int status;
    while (wait(&status) > 0)
        ;
    printf("Directory copied successfully.\n");
    return 0;
}