Part 7. Pipelines and Filters
Written by Pantas Manik on 2:11 PM
CONCEPT: Unix allows you to connect processes by letting the standard output of one process feed into the standard input of another process. That mechanism is called a pipe.
Connecting simple processes in a pipeline allows you to perform complex tasks without writing complex programs.
EXAMPLE: Using the more command, and a pipe, you can manage the screen presentation of command output. Examine the contents of the /etc directory by typing:
ls -l /etc | more
to the shell.
If you type "q" to exit the more command, where does the remaining output of the ls command go? The answer lies in the way pipelined processes communicate. A pipe is a small buffer that the kernel keeps in memory. The shell connects the standard output of ls to the write end of the pipe, and the standard input of more to the read end. If ls writes faster than more reads, the buffer fills and ls is paused until more catches up. When more exits, the read end of the pipe is closed. The next time ls tries to write, the kernel sends it the SIGPIPE signal, which by default terminates it, because its output has nowhere to go.
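EXAMPLE: The same behavior can be seen without more. In this small sketch, the yes command (which prints "y" forever) stands in for a producer with a lot of output, and head stands in for a reader that quits early. The variable name first_three is only for illustration.

```shell
# yes would print "y" forever if nothing stopped it.
# head exits after reading three lines, which closes its end of the pipe,
# so yes is stopped too and the pipeline finishes instead of hanging.
first_three=$(yes | head -n 3)
echo "$first_three"
```

If the writer were not stopped when its reader went away, this command would never return.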
EXERCISE: How could you use head and tail in a pipeline to display lines 25 through 75 of a file?
ANSWER: The command:
cat file | head -75 | tail -51
would work. The cat command feeds the file into the pipeline. The head command gets the first 75 lines of the file, and passes them down the pipeline to tail. The tail command then filters out all but the last 51 lines of the input it received from head (lines 25 through 75 inclusive are 75 - 25 + 1 = 51 lines). It is important to note that in the above example, tail never receives the original file, but only sees the 75 lines that were passed to it by the head command.
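EXAMPLE: A range like this is easy to get wrong by one line, so it helps to check it on numbered input. The sketch below assumes the seq command is available (it is common but not universal) and uses the -n form of the head and tail options. Lines 25 through 75 inclusive are 51 lines, so tail is asked for 51. The variable names are only for illustration.

```shell
# seq 1 100 prints the numbers 1..100, one per line, so each line's
# content equals its line number.
in_range=$(seq 1 100 | head -n 75 | tail -n 51)

first_line=$(printf '%s\n' "$in_range" | head -n 1)
last_line=$(printf '%s\n' "$in_range" | tail -n 1)
echo "first: $first_line, last: $last_line"

# sed can print the same range in one step.
range_sed=$(seq 1 100 | sed -n '25,75p')
```

The sed form avoids the arithmetic altogether, since the first and last line numbers are written directly.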
It is easy for beginners to confuse the usage of the input/output redirection symbols < and >, with the usage of the pipe. Remember that input/output redirection connects processes with files, while the pipe connects processes with other processes.
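EXAMPLE: The difference can be sketched with the sort command. The temporary file comes from mktemp, which is assumed to be installed, and the variable names are only for illustration.

```shell
tmpfile=$(mktemp)

# ">" connects a process to a file: printf writes into tmpfile.
printf 'pear\napple\nfig\n' > "$tmpfile"

# "<" connects a file to a process: sort reads from tmpfile.
from_file=$(sort < "$tmpfile")

# "|" connects a process to a process: sort reads what printf writes.
from_pipe=$(printf 'pear\napple\nfig\n' | sort)

rm -f "$tmpfile"
echo "$from_pipe"
```

Both forms give sort the same input. Only the pipe does so without a file in between.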
Grep
The grep utility is one of the most useful filters in Unix. Grep searches line-by-line for a specified pattern, and outputs any line that matches the pattern. The basic syntax for the grep command is grep [-options] pattern [file]. If the file argument is omitted, grep will read from standard input. It is always best to enclose the pattern within single quotes, to prevent the shell from misinterpreting the command.
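EXAMPLE: The following sketch runs grep both with and without the file argument, and with one option. The file contents and variable names are made up for illustration, and mktemp is assumed to be installed.

```shell
tmpfile=$(mktemp)
printf 'alpha\nbeta\ngamma\n' > "$tmpfile"

# File argument given: grep opens the file itself.
from_file=$(grep 'mm' "$tmpfile")

# File argument omitted: grep reads whatever arrives on standard input.
from_stdin=$(cat "$tmpfile" | grep 'mm')

# Options go before the pattern; -c prints a count of matching lines.
count=$(grep -c 'a$' "$tmpfile")

rm -f "$tmpfile"
echo "$from_file $from_stdin $count"
```

Because grep falls back to standard input, it can be placed anywhere in a pipeline.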
The grep utility recognizes a variety of patterns, known as regular expressions. The pattern syntax comes from the ed editor, and is shared with vi. Here are some of the characters you can use to build grep expressions:
1. The caret (^) matches the beginning of a line.
2. The dollar sign ($) matches the end of a line.
3. The period (.) matches any single character.
4. The asterisk (*) matches zero or more occurrences of the previous character.
5. The expression [a-b] matches any single character in the range from a through b, inclusive.
Note that some of the pattern matching characters are also shell meta characters. If you use one of those characters in a grep command, make sure to enclose the pattern in single quote marks, to prevent the shell from trying to interpret them.
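EXAMPLE: The pattern characters listed above can be tried on a few lines of made-up input. In this sketch the sample names and variable names are hypothetical, and grep -c is used to count the matching lines rather than print them.

```shell
# Hypothetical sample input, one name per line.
sample='jon
jonas
dijon
jan
j.n'

starts=$(printf '%s\n' "$sample" | grep -c '^jon')         # jon, jonas
ends=$(printf '%s\n' "$sample" | grep -c 'jon$')           # jon, dijon
any_char=$(printf '%s\n' "$sample" | grep -c 'j.n')        # every line
repeat=$(printf '%s\n' "$sample" | grep -c 'jo*n')         # jon, jonas, dijon
in_set=$(printf '%s\n' "$sample" | grep -c '^j[a-o]n$')    # jon, jan
echo "$starts $ends $any_char $repeat $in_set"
```

Every pattern is in single quotes, so the shell passes characters such as * and $ to grep unchanged.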
EXAMPLE: Type the command:
grep 'jon' /etc/passwd
to search the /etc/passwd file for any lines containing the string "jon".
EXAMPLE: Type the command:
grep '^jon' /etc/passwd
to see the lines in /etc/passwd that begin with the character string "jon".
EXERCISE: List all the files in the /tmp directory owned by the user root.
EXPLANATION: The command:
ls -l /tmp | grep 'root'
would show every line of the long listing of the /tmp directory that contains the string "root". Note that this is broader than ownership: a file not owned by the root user will still appear in the output if "root" occurs in its name or its group. Even so, the grep filter can cut down the number of lines of output you will have to look at.
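EXAMPLE: One way to tighten the filter is to test only the owner column. The sketch below uses awk, which is not covered in this part, on a hypothetical listing. The file names, sizes, and dates are made up, and the owner is assumed to be in the third column, as in typical ls -l output.

```shell
# Hypothetical "ls -l" output; column 3 is the owner, column 4 the group.
listing='-rw-r--r-- 1 root  root   120 Jan  5 10:00 notes.txt
-rw-r--r-- 1 alice users   64 Jan  5 10:01 root-backup.tar
drwx------ 2 root  root  4096 Jan  5 10:02 cache'

# Plain grep counts every line that mentions "root" anywhere.
loose=$(printf '%s\n' "$listing" | grep -c 'root')

# awk compares only the owner column and counts the lines that match.
strict=$(printf '%s\n' "$listing" | awk '$3 == "root" { n++ } END { print n + 0 }')
echo "grep: $loose lines, owner column: $strict lines"
```

On a real system, the same test would be written as ls -l /tmp | awk '$3 == "root"'.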