I/O redirection and pipelining

In programming there are three standard 'streams' of communication

  • standard input (stdin)
  • standard output (stdout)
  • standard error (stderr)

Stdin is what goes into a command, stdout is what comes out of it, and stderr is somewhere for the command to write messages about how things are going (such as error messages).
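As a quick sketch of how the output streams stay separate, the following assumes a file called no_such_file does not exist. (It uses 2>, which redirects stderr; that isn't covered on this page, and appears here only to show that stdout and stderr really are distinct.)

```shell
# stdout and stderr are separate streams: the directory listing goes
# to stdout, while the error message goes to stderr.
# (Assumes no file called no_such_file exists.)
ls /etc no_such_file > listing.txt 2> errors.txt
cat errors.txt    # the error message ended up here, not in listing.txt
```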

One of the most powerful features of the shell is the ability to 'redirect' output, and to 'pipe' this output into other commands. This allows you to build up pipelines of operations that do sophisticated things.

Redirecting standard output

By default, stdout from a command is sent to the terminal, and that's what you're seeing when you type ls, etc.

However, sometimes you might instead want this output written to a file.

As we saw already, if you follow a command with > and the name of a file, then stdout will go there instead, so

% ls > myfiles.txt

will send no output to the terminal, but will instead create a file myfiles.txt. We say that the output has been "redirected" to the file.

Warning

Be CAREFUL. Like most operations that write to files, this command will overwrite the file if it already exists, so you could lose data. As usual, you have to make sure you know what you are doing first. (If in doubt, check with ls.)

If you use >> instead of >, then the output will be appended to the file rather than overwriting it. So, for example, to list all the files into the file a second time as well:

% ls >> myfiles.txt

Note

Remember you can use cat to look at the contents of a file.
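Putting > and >> together, here is a tiny sketch of the difference, using echo and a throwaway file name (demo.txt):

```shell
# > truncates (or creates) the file; >> appends to it.
echo "first line" > demo.txt     # demo.txt now holds one line
echo "second line" >> demo.txt   # appended: demo.txt now holds two lines
echo "started over" > demo.txt   # truncated: back to one line
cat demo.txt
```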

Pipelines and standard input

Many commands can also accept input from stdin. A good example is wc: if it's not given a filename, wc will work on stdin instead.

The best way to get input into stdin is to pipe it in, using the | character. For example, if you read the section on working with files, you know that the cat command will print out the contents of a file:

% cat myfiles.txt

On the other hand, wc -l counts the number of lines in its input. To count the number of lines in the file, we just have to connect one to the other:

% cat myfiles.txt | wc -l

We say the output of cat has been "piped" into the input of wc.

Note

It's also possible to redirect a file into the input of wc -l directly - like this:

% wc -l < myfiles.txt

However, I find this more confusing because I prefer thinking of the data as flowing from left to right along the pipeline, so I don't generally use this.

Although this particular command could be done more simply just by writing wc -l myfiles.txt, pipelining is much more flexible and more powerful. For example, suppose we wanted to count the number of words in the first line of the sonnet we were reading. We could combine commands from the working with files page:

% cat sonnet.txt | head -n 1 | wc -w

Note

Remember that you can get help on a command like wc by viewing its man page - man wc. Press q to quit the man page.

For example, here are some quick recipes.

The number of lines and words in sonnet.txt:

% cat sonnet.txt | wc -l
% cat sonnet.txt | wc -w

The number of words in the first and last line:

% cat sonnet.txt | head -n 1 | wc -w
% cat sonnet.txt | tail -n 1 | wc -w

...or the number of lines that contain the word 'fair' (using grep, which we shall cover on the next page):

% cat sonnet.txt | grep 'fair' | wc -l

Quick tip: the command history

As you build these pipelines, you may be getting bored of typing the same commands and filenames over and over again. However, one thing that helps is that the command line keeps a command history. To see commands you've run before, use history:

% history

Even better, you can get back to these commands by pressing the up arrow (and move forwards through them again by pressing the down arrow). Try this now: press the up arrow to get back to that previous command and try editing it to use wc -w in place of wc -l. Bingo, in only a few keypresses you've changed to count words instead of lines.

Using this together with tab completion can save a great deal of typing and makes building pipelines and repeating commands quite quick. See this page for more on the history.


Combining pipelines with redirection

Of course, you can also combine pipelines with redirection. For example, let's count the number of text files again and store the result in a new file:

% ls *.txt | wc -l > number_of_text_files.txt


(You can use `ls` and `cat` to check it got the answer right.)

There is one other useful way to get the output into a file: the `tee` command. This takes input on stdin, echoes it to stdout, and also sends a copy to a file. (You can imagine a letter T with standard input coming from the left, standard output going to the right, and a copy being dropped in a file at the bottom.)

So let's list all the text files into text_files.txt, but also count them:

% ls *.txt | tee text_files.txt | wc -l

Pipelining works because many programs read from stdin and write to stdout. Typically they work one line at a time. Therefore you can build up pipelines made up of commands that each process the stream of lines in one way, passing on the results to the next command in the pipeline. At the end, you can redirect the output into a file.
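To illustrate, here is a sketch of a longer pipeline. (sort and uniq aren't covered in this guide; the point is just that each stage processes the stream of lines and passes the result on, with the final output redirected into a file.)

```shell
# Build a small input file, then count how often each line occurs:
# sort groups identical lines together, uniq -c counts each group,
# and sort -rn puts the most frequent first. The result is redirected
# into a file at the end of the pipeline.
printf 'b\na\nb\nc\na\nb\n' > letters.txt
cat letters.txt | sort | uniq -c | sort -rn > letter_counts.txt
cat letter_counts.txt
```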


Next steps

Finally we will learn about a set of commands that can perform complex filtering operations on files.