5.3 Pipes
1. What are Pipes?
In computing, a pipe is a mechanism for inter-process communication (IPC) that allows data to be passed from one process to another. A pipe acts as a communication channel: whatever one process writes into the write end becomes the input that another process reads from the read end.
Pipes are commonly used in Unix-like operating systems (such as Linux and macOS), and they can be a powerful tool for building complex, multi-process programs.
1.1 Types of Pipes
- Anonymous Pipes:
  - Anonymous pipes are the simplest form of pipe. They are used for communication between related processes that share a common ancestor, such as a parent process and its child.
  - They are created using the pipe() system call in Unix-like systems.
  - These pipes are unidirectional: data flows in one direction only, from the write end to the read end.
- Named Pipes (FIFOs):
  - Named pipes (also called FIFOs, short for First In, First Out) are similar to anonymous pipes, but they have a name in the filesystem and persist until they are removed.
  - They are not limited to related processes; any processes that can open the file can use them to communicate.
  - Named pipes are created using the mkfifo() function (or the mkfifo shell command) and appear as special files in the filesystem.
Example of a Named Pipe
Any process with permission to open the FIFO can write to it or read from it. Here is an example in a bash terminal:
    mkfifo my_pipe
    gzip -c < my_pipe >> out.gz &
The mkfifo command creates a named pipe called my_pipe. We then start gzip -c in the background; it compresses whatever data arrives on my_pipe and appends the result to the file out.gz. Now we can send data into the pipe, for example:
    cat file > my_pipe
Because gzip is still running in the background, it reads the data, compresses it, and appends it to out.gz. When cat finishes and closes the pipe, gzip sees end-of-file and exits. The pipe can be removed at any time with the command rm my_pipe.
1.2 Key Concepts
- Pipe System Calls:
  - pipe(): Creates a pipe, providing two file descriptors, one for reading and one for writing.
  - read(): Reads data from the read end of the pipe.
  - write(): Writes data to the write end of the pipe.
  - close(): Closes a file descriptor (read or write end).
  - dup2(): Redirects a file descriptor to another, often used to redirect input/output to/from pipes.
  - mkfifo(): Creates a named pipe (FIFO) in the filesystem.
- Unidirectional Communication:
  - Pipes are typically unidirectional, meaning data flows from the write end to the read end. However, you can create multiple pipes for bidirectional communication between two processes.
- File Descriptors:
  - A pipe is represented by two file descriptors:
    - One for reading (pipefd[0]).
    - One for writing (pipefd[1]).
  - These file descriptors are used to interact with the pipe in the same way you interact with regular files, through the read(), write(), and close() system calls.
1.3 How Pipes Work
When a pipe is created, the kernel sets up a buffer between its two ends. The writing process puts data into the buffer through the write end, and the reading process takes data out through the read end. Data is delivered in first-in, first-out (FIFO) order.
Example Workflow
- Process A (parent) writes data to the pipe:
  - It uses the write end (pipefd[1]) to write data to the pipe.
- Process B (child) reads data from the pipe:
  - It uses the read end (pipefd[0]) to read the data written by Process A.
The operating system handles the transfer of data from the write end to the read end of the pipe. If there is no data to read, the read() call blocks (waits) until there is something to read. Once every write end has been closed and the buffer is empty, read() returns 0 to signal end-of-file.
1.4 Example of Using Pipes in C (Anonymous Pipe)
Here is an example where we create a pipe, fork a child process, and have the child process write data to the pipe while the parent reads it.
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>   // for wait()

    int main(void) {
        int pipefd[2];
        pid_t pid;

        // Create a pipe
        if (pipe(pipefd) == -1) {
            perror("pipe");
            exit(EXIT_FAILURE);
        }

        // Fork a child process
        pid = fork();

        if (pid == -1) {
            perror("fork");
            exit(EXIT_FAILURE);
        } else if (pid == 0) {
            // Child process: write to the pipe
            close(pipefd[0]);  // Close unused read end
            // 18 bytes: 17 characters plus the terminating '\0'
            write(pipefd[1], "Hello from child!", 18);
            close(pipefd[1]);  // Close write end after use
            exit(EXIT_SUCCESS);
        } else {
            // Parent process: read from the pipe
            close(pipefd[1]);  // Close unused write end
            char buffer[100];
            ssize_t n = read(pipefd[0], buffer, sizeof(buffer) - 1);
            if (n == -1) {
                perror("read");
                exit(EXIT_FAILURE);
            }
            buffer[n] = '\0';  // Make sure the data is a valid C string
            printf("Parent received: %s\n", buffer);
            close(pipefd[0]);  // Close read end after use
            wait(NULL);        // Wait for child to finish
        }

        return 0;
    }
Explanation of the Example:
- Pipe Creation: The pipe() system call creates a pipe, resulting in two file descriptors (pipefd[0] for reading and pipefd[1] for writing).
- Forking: The fork() system call creates a child process.
- Child Process:
  - The child process writes a message ("Hello from child!") to the pipe using the write end (pipefd[1]).
  - After writing, it closes the write end and exits.
- Parent Process:
  - The parent process reads the data from the pipe using the read end (pipefd[0]).
  - It then prints the data to the terminal and waits for the child to finish.
1.5 Use Cases for Pipes
- Command Piping in Shells:
  - Pipes are commonly used in command-line environments (such as Unix/Linux shells) to chain commands together. For example:
        ls | grep "dev"
    This command lists the contents of the current directory and pipes the output to the grep command to filter the results.
- Process Communication:
  - Anonymous pipes can be used for communication between a parent and child process, or between any two related processes that have inherited the pipe's file descriptors.
- Data Streaming:
  - Pipes can be used in systems that require streaming data between processes, such as logging systems, real-time data processing, etc.
1.6 Advantages and Limitations of Pipes
Advantages
- Simple: Pipes are a simple and efficient method of communication between processes.
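- Speed: For small amounts of data, pipes are fast and need little setup compared with mechanisms such as message queues or shared memory, although shared memory generally gives higher throughput for large transfers because it avoids copying.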
- Built-In: Available natively in Unix-like operating systems and easy to implement using system calls.
Limitations
- Unidirectional: Standard pipes are unidirectional. For bidirectional communication, you need two pipes or another IPC method.
- Buffer Size: A pipe has a limited kernel buffer (for example, 64 KB by default on modern Linux, and 4 KB on older kernels). When the buffer is full, write() blocks until the reader has removed some data.
- Related Processes Only: Anonymous pipes are typically limited to related processes (such as parent and child); unrelated processes need a named pipe or another IPC mechanism.
2. Sample Program
Write a C program that uses fork, exec and pipe to perform the equivalent of the shell command:
[user@pc]$ ls /dev | head -25
The following C program demonstrates inter-process communication (IPC) using pipes. It creates a pipe, forks a child process, and redirects the output of the ls command (which lists files in the /dev directory) to the parent process via the pipe. The parent then reads the first 25 lines of the output from the pipe and prints them to the terminal.
2.1 Header Files
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>
- stdio.h: Provides functions like printf() for printing to the console.
- stdlib.h: Includes functions like exit() for terminating the program with a status code.
- unistd.h: Provides system calls such as pipe(), fork(), and dup2().
- sys/types.h: Defines data types like pid_t, which is used for process IDs.
- sys/wait.h: Declares wait() and the related macros for waiting on child processes.
2.2 Create a Pipe
    int pipefd[2];
    if (pipe(pipefd) == -1) {
        perror("pipe");
        exit(EXIT_FAILURE);
    }
- pipe(pipefd) creates a pipe with two file descriptors:
  - pipefd[0] is the read end of the pipe.
  - pipefd[1] is the write end of the pipe.
- If pipe() fails (e.g., due to resource limitations), the program prints an error and exits.
2.3 Fork a Child Process
    pid = fork();
    if (pid == -1) {
        perror("fork");
        exit(EXIT_FAILURE);
    } else if (pid == 0) {
        // Child process code
    } else {
        // Parent process code
    }
- fork() creates a new process:
  - The parent process receives the child’s PID (a positive integer).
  - The child process receives 0.
- If fork() fails, it returns -1, and the program prints an error and exits.
2.4 Child Process
    if (pid == 0) { // Child process

        // Redirect stdout to the write end of the pipe
        if (dup2(pipefd[1], STDOUT_FILENO) == -1) {
            perror("dup2");
            exit(EXIT_FAILURE);
        }

        // Close unused read end of the pipe
        close(pipefd[0]);

        // Execute the command "ls /dev" using exec
        char *args[] = {"ls", "/dev", NULL};
        execv("/bin/ls", args);

        // If exec fails
        perror("exec");
        exit(EXIT_FAILURE);
    }
- Redirecting stdout:
  - The dup2() system call redirects the child’s standard output (STDOUT_FILENO, file descriptor 1) to the write end of the pipe (pipefd[1]).
  - After this, any output the child writes to stdout will go into the pipe instead of to the terminal.
- Close Unused Read End:
  - The child process closes pipefd[0], as it will not be using the read end of the pipe.
- Execute the Command:
  - execv("/bin/ls", args) replaces the child process with the ls command that lists files in the /dev directory.
  - execv() runs the command specified (in this case, ls /dev), and the child process no longer exists in its original form after execution.
  - If execv() fails (e.g., because the executable is not found), the child process prints an error message and exits.
2.5 Parent Process
    else { // Parent process

        // Close unused write end of the pipe
        close(pipefd[1]);

        // Read from the read end of the pipe and print the first 25 lines
        char buffer;
        int count = 0;

        // read() returns 0 at end-of-file and -1 on error; stop on either
        while (count < 25 && read(pipefd[0], &buffer, sizeof(buffer)) > 0) {
            printf("%c", buffer);
            if (buffer == '\n') {
                count++;
            }
        }

        // Close the read end of the pipe
        close(pipefd[0]);

        // Wait for the child process to finish
        wait(NULL);
    }
- Close Unused Write End:
  - The parent closes pipefd[1] because it will only be reading from the pipe (not writing).
- Reading from the Pipe:
  - The parent reads one character at a time from the pipe using read(pipefd[0], &buffer, sizeof(buffer)). The program continues reading until 25 newlines (\n) have been encountered or the pipe reaches end-of-file, effectively printing the first 25 lines of output from the ls /dev command.
- Close the Read End:
  - After reading, the parent closes pipefd[0] to clean up resources. If ls is still writing at that point, it receives SIGPIPE and terminates, which is also what happens in the shell pipeline.
- Waiting for the Child:
  - The parent calls wait(NULL) to wait for the child process to finish executing before it terminates. This prevents the parent from finishing before the child and avoids leaving a “zombie” child process.
2.6 Program Execution Flow
- Step 1: The parent creates a pipe.
- Step 2: The program calls fork() to create a child process.
- Step 3 (Child Process):
  - The child redirects its output to the pipe using dup2().
  - It then executes the ls /dev command using execv(). The output of ls /dev will be written into the pipe.
- Step 4 (Parent Process):
  - The parent closes the write end of the pipe, then reads the output from the pipe.
  - It prints the first 25 lines of output from ls /dev.
- Step 5: The parent waits for the child process to finish using wait(), ensuring that both processes terminate cleanly.
Key Concepts
- Pipe: A pipe is used to enable communication between processes. One process writes to the pipe, and the other reads from it.
- Forking: The fork() system call creates a child process. The parent and child processes then execute concurrently.
- Redirection: The dup2() system call redirects the standard output of the child process to the write end of the pipe.
- Exec: The execv() system call replaces the child process’s image with the specified command, in this case, ls /dev.
- IPC (Inter-Process Communication): Pipes are used for communication between the parent and child processes. The child writes output to the pipe, and the parent reads from it.
Example Output
When the program runs, it will print the first 25 lines of the output of ls /dev. The /dev directory contains device files, so you will likely see entries such as:
    tty
    tty0
    tty1
    tty2
    ...
(Note: The actual output will depend on your system’s /dev directory.)
2.7 Complete C Code
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    int main(void) {
        int pipefd[2];
        pid_t pid;

        // Create a pipe
        if (pipe(pipefd) == -1) {
            perror("pipe");
            exit(EXIT_FAILURE);
        }

        // Fork a child process
        pid = fork();

        if (pid == -1) {
            perror("fork");
            exit(EXIT_FAILURE);
        } else if (pid == 0) { // Child process

            // Redirect stdout to the write end of the pipe
            if (dup2(pipefd[1], STDOUT_FILENO) == -1) {
                perror("dup2");
                exit(EXIT_FAILURE);
            }

            // Close unused read end of the pipe
            close(pipefd[0]);

            // Execute the command "ls /dev" using exec
            char *args[] = {"ls", "/dev", NULL};
            execv("/bin/ls", args);

            // If exec fails
            perror("exec");
            exit(EXIT_FAILURE);
        } else { // Parent process

            // Close unused write end of the pipe
            close(pipefd[1]);

            // Read from the read end of the pipe and print the first 25 lines
            char buffer;
            int count = 0;

            // read() returns 0 at end-of-file and -1 on error; stop on either
            while (count < 25 && read(pipefd[0], &buffer, sizeof(buffer)) > 0) {
                printf("%c", buffer);
                if (buffer == '\n') {
                    count++;
                }
            }

            // Close the read end of the pipe
            close(pipefd[0]);

            // Wait for the child process to finish
            wait(NULL);
        }

        return 0;
    }
3. stdout vs STDOUT_FILENO
3.1 What is STDOUT_FILENO?
STDOUT_FILENO is a macro, defined by POSIX in <unistd.h>, that names the file descriptor for standard output. It is used in system calls (like write() or dup2()) that take a file descriptor.
The value of STDOUT_FILENO is 1: in Unix-like systems, standard output corresponds to file descriptor 1. So when you use STDOUT_FILENO, you’re working with the file descriptor for standard output.
3.2 Is STDOUT_FILENO the same as stdout?
No, STDOUT_FILENO is not the same as stdout (the C standard library's output stream, which is spelled in lowercase; C has no identifier named STDOUT). Here’s how they differ:
- STDOUT_FILENO:
  - It is a file descriptor (an integer value, 1).
  - It is used in low-level system calls like write(), dup2(), or close(), which work with file descriptors.
  - It represents the standard output stream at the file descriptor level.
  For example:
      write(STDOUT_FILENO, "Hello, World!\n", 14);
  This uses the file descriptor to write directly to the standard output.
- stdout:
  - It is a FILE * (a pointer to a FILE object), declared in <stdio.h>.
  - It is used in high-level standard I/O functions like fprintf(), fputs(), or fwrite(), which work with FILE * pointers.
  - It represents the standard output stream as a FILE object, which is buffered.
  For example:
      fprintf(stdout, "Hello, World!\n");
  Here, stdout is used with the fprintf() function, which writes to a FILE *.