In Part 1, we covered redirection and filters. In this post, we are taking things further by looking at pipes, special characters, and text manipulation tools. These are what make the command line truly powerful, allowing you to chain commands together and process data in flexible ways.
The Pipe Operator
The pipe operator | connects two commands by sending the standard output of one directly into the standard input of the next. Instead of saving intermediate results to a file, data flows straight from one command to another.
For example, to list the contents of a directory and search for a specific file:
ls | grep "report"
Or to sort a file and remove duplicates in one step:
sort names.txt | uniq
You can chain as many commands as you need. Each one receives the output of the previous and passes its own output to the next.
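As a small sketch of chaining, the snippet below feeds a few made-up names (sample data, not from any real file) through several commands in a row. The final wc -l counts how many unique names come out of sort and uniq.

```shell
# Hypothetical sample data: five names, two of them repeated.
names='carol
alice
bob
alice
carol'

# sort groups identical lines together, uniq drops the adjacent repeats,
# and wc -l counts the lines that remain. tr strips the padding that
# some versions of wc add around the number.
unique_count=$(printf '%s\n' "$names" | sort | uniq | wc -l | tr -d ' ')

echo "Unique names: $unique_count"
```

Each stage only sees the text handed to it by the stage before, which is why the order of the commands matters: uniq only removes repeats that sit next to each other, so sort has to come first.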
Practical Examples Using Pipes
Counting word frequency in a file:
cat article.txt | grep -oE '\w+' | sort | uniq -c | sort -nr
This reads the file, extracts individual words, sorts them, counts each unique word, and sorts the results by frequency.
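One thing this pipeline does not do is fold case, so The and the are counted as different words. A possible variation, assuming you want them merged, lowercases the text with tr before extracting words. The sample sentence below is made up and stands in for article.txt.

```shell
# Hypothetical sample text standing in for article.txt.
text='The cat saw the dog. The dog ran.'

# Lowercase everything first so "The" and "the" count as one word,
# then extract words, count them, and keep only the most frequent one.
top_word=$(printf '%s\n' "$text" \
  | tr '[:upper:]' '[:lower:]' \
  | grep -oE '\w+' \
  | sort | uniq -c | sort -nr \
  | head -n 1)

echo "$top_word"
```

The output line holds the count followed by the word, in the format uniq -c produces.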
Finding the largest PDF files in a directory:
find /path/to/directory -type f -name "*.pdf" -print0 | xargs -0 du -h | sort -rh
This finds all PDF files, checks their sizes, and sorts them from largest to smallest. The -print0 and -0 options keep file names that contain spaces intact.
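File names containing spaces are the usual trap when piping find into xargs, because xargs splits its input on whitespace by default. The sketch below builds a throwaway directory with mktemp (the file names and contents are invented for illustration) and passes the names NUL-separated instead.

```shell
# Scratch directory with two hypothetical PDFs, one with a space in its name.
workdir=$(mktemp -d)
printf 'small' > "$workdir/notes.pdf"
printf 'a slightly longer file' > "$workdir/annual report.pdf"

# -print0 and -0 separate names with NUL bytes instead of whitespace,
# so "annual report.pdf" reaches du as a single argument.
listing=$(find "$workdir" -type f -name '*.pdf' -print0 | xargs -0 du -h | sort -rh)

echo "$listing"

# Remove the scratch directory again.
rm -rf "$workdir"
```

The sizes du reports depend on the file system, so the exact numbers will differ from machine to machine.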
Analyzing a log file:
cat server.log | grep -oE 'GET /[^ ]+' | sort | uniq -c | sort -nr > popular_pages.txt
This extracts the path of each GET request from a log file, counts how often each one appears, sorts them from most to least requested, and saves the results to a file.
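To see what this pipeline produces before pointing it at a real log, you can run it on a few made-up lines. The log format below is an assumption, loosely modelled on common access logs, and the result goes into a variable rather than a file.

```shell
# Hypothetical access-log lines; a real server.log will have more fields.
log='10.0.0.1 - - "GET /index.html HTTP/1.1" 200
10.0.0.2 - - "GET /about.html HTTP/1.1" 200
10.0.0.3 - - "GET /index.html HTTP/1.1" 200
10.0.0.4 - - "POST /login HTTP/1.1" 302'

# Same pipeline as above, reading from the variable instead of a file.
# The POST line does not match the pattern, so it is dropped.
popular=$(printf '%s\n' "$log" | grep -oE 'GET /[^ ]+' | sort | uniq -c | sort -nr)

echo "$popular"
```

Each output line is a count followed by the matched text, with the most requested path first.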
Special Characters in the Shell
Special characters have specific meanings in the shell. Understanding them helps you write more precise commands and avoid unexpected behavior.
Whitespace separates a command from its arguments, and the arguments from one another. For example:
ls -l
Single quotes (') preserve the exact text inside them, including any special characters or variables:
echo 'Hello, $USER'
This prints Hello, $USER literally, without substituting the variable.
Double quotes (") preserve most characters but still allow variable substitution:
echo "Hello, $USER"
This prints your actual username in place of $USER.
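A quick way to see the difference is to put the same text in both kinds of quotes. The sketch below uses a variable called name, defined only for this illustration, instead of $USER so that the output is the same on any machine.

```shell
# A variable defined here for illustration, so the result does not
# depend on who runs the script.
name='world'

single='Hello, $name'   # single quotes: $name stays as literal text
double="Hello, $name"   # double quotes: $name is replaced by its value

echo "$single"
echo "$double"
```

The first echo prints the dollar sign and variable name as written, while the second prints Hello, world.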
Backslash (\) escapes a special character, making it literal:
echo "This is a \"quoted\" word."
Hash (#) starts a comment. Anything after it on the same line is ignored by the shell:
# This is a comment
Semicolon (;) lets you run multiple commands on a single line, one after the other:
command1 ; command2
Tilde (~) is a shortcut for your home directory:
cd ~
Text Manipulation Tools
Beyond the filters covered in Part 1, there are a few more tools worth knowing for working with text directly.
sed is a stream editor that can find and replace text, delete lines, or print specific lines from a file. To print the fifth line of a file:
sed -n '5p' story.txt
To replace every occurrence of the word "old" with "new":
sed 's/old/new/g' filename.txt
awk is a more powerful text processing tool that works on structured data. To print the fifth line of a file:
awk 'NR == 5' story.txt
To print the first and third columns of a space-separated file:
awk '{print $1, $3}' data.txt
rev reverses the characters on each line. Simple but occasionally useful:
echo "hello" | rev
This outputs olleh.
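The sed and awk one-liners above are easy to try on a scratch file. This sketch writes a small made-up file (its contents and the temporary path from mktemp are placeholders) and runs the same kinds of commands against it.

```shell
# Scratch file with six hypothetical lines of three space-separated columns.
tmpfile=$(mktemp)
printf '%s\n' 'a1 old c1' 'a2 b2 c2' 'a3 b3 c3' 'a4 b4 c4' 'a5 old c5' 'a6 b6 c6' > "$tmpfile"

fifth_sed=$(sed -n '5p' "$tmpfile")                    # print only line 5
fifth_awk=$(awk 'NR == 5' "$tmpfile")                  # same line, using awk
replaced=$(sed 's/old/new/g' "$tmpfile" | head -n 1)   # first line after replacing
columns=$(awk 'NR == 2 {print $1, $3}' "$tmpfile")     # columns 1 and 3 of line 2

echo "$fifth_sed"
echo "$replaced"
echo "$columns"

# Remove the scratch file again.
rm -f "$tmpfile"
```

Used this way, sed prints the edited text to its output and leaves the file itself unchanged, which is why the fifth line still contains old.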
cat can also be used to concatenate multiple files and print the combined output:
cat file1.txt file2.txt
Putting It All Together
The real power of the shell comes from combining these tools. Here is an example that cleans up a messy CSV file:
cat messy_data.csv | sed 's/ *, */,/g' | awk -F ',' -v OFS=',' '{print $1,$3,$2}' > clean_data.csv
This reads the file, cleans up spacing around commas, swaps the second and third columns while keeping commas as the separator, and saves the result to a new file. Each command does one specific job, and together they handle a task that would take much longer to do manually.
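One detail worth checking in a pipeline like this is awk's output separator. By default awk joins printed fields with a space, so writing CSV back out needs OFS set to a comma. The sketch below assumes a three-column file and uses made-up rows to show the result.

```shell
# Hypothetical messy rows: inconsistent spacing around the commas.
messy='id , name ,city
1,Ada , London
2 ,Linus,Helsinki'

# sed removes the spaces around each comma; awk then swaps columns 2 and 3.
# -v OFS=',' makes awk join the printed fields with commas instead of spaces.
clean=$(printf '%s\n' "$messy" \
  | sed 's/ *, */,/g' \
  | awk -F ',' -v OFS=',' '{print $1, $3, $2}')

echo "$clean"
```

Because the awk program prints exactly three fields, any further columns in a wider file would be dropped, so the field list needs adjusting to match the real data.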
Final Thoughts
Pipes, special characters, and text manipulation tools are what turn the shell from a simple command runner into a genuinely powerful environment for processing data. The best way to get comfortable with them is to start using them on real files and real problems. Start small, chain two commands together, and build from there.