Text Processing

These commands help you view, manipulate, search, and transform text files on your Linux system.

cat

Concatenates and displays file contents.

cat [options] [file(s)]

diff

Examples:

# Compare two files
diff file1.txt file2.txt

# Unified format (for patches)
diff -u file1.txt file2.txt > changes.patch

# Side-by-side comparison
diff -y file1.txt file2.txt

# Ignore all whitespace differences
diff -w file1.txt file2.txt

# Compare directories
diff -r dir1/ dir2/

# Create patch file
diff -Naur original/ modified/ > changes.patch

comm

Compares two sorted files line by line.

comm [options] file1 file2

Options:

Examples:

# Compare two sorted files
comm sorted1.txt sorted2.txt

# Show only lines unique to first file
comm -23 sorted1.txt sorted2.txt

# Show only lines unique to second file
comm -13 sorted1.txt sorted2.txt

# Show only lines common to both files
comm -12 sorted1.txt sorted2.txt
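
Because comm expects both inputs to be sorted already, unsorted files need a sorting step first. The sketch below shows one way to do that with temporary copies; `common_lines` is a hypothetical helper name and the sample lists are invented for illustration.

```bash
# common_lines FILE1 FILE2: print lines present in both files.
# Hypothetical helper: sorts copies first, since comm expects sorted input.
common_lines() {
    tmpdir=$(mktemp -d) || return 1
    sort "$1" > "$tmpdir/a"
    sort "$2" > "$tmpdir/b"
    comm -12 "$tmpdir/a" "$tmpdir/b"
    rm -rf "$tmpdir"
}

# Demo with two small unsorted lists (invented sample data)
demo=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$demo/list1.txt"
printf 'fig\nkiwi\napple\n' > "$demo/list2.txt"
common_lines "$demo/list1.txt" "$demo/list2.txt"
rm -rf "$demo"
```

With the sample data, the lines shared by both lists are apple and fig. Swapping `-12` for `-23` or `-13` would list the lines unique to one side instead.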

fold

Wraps each input line to fit a specified width.

fold [options] [file(s)]

Options:

Examples:

# Wrap lines to 40 columns
fold -w 40 file.txt

# Wrap at spaces (avoid breaking words)
fold -w 40 -s file.txt

fmt

Simple text formatter that reflows paragraphs to a target line width.

fmt [options] [file(s)]

Options:

Examples:

# Format with 60-column width
fmt -w 60 file.txt

# Format with uniform spacing
fmt -u file.txt

column

Formats input into multiple columns.

column [options] [file(s)]

Options:

Examples:

# Format in columns
ls | column

# Format delimited file as table
column -t -s, file.csv

# Format with custom output delimiter (-o is specific to the util-linux version of column)
column -t -s, -o '|' file.csv

tee

Reads from standard input and writes to standard output and files.

tee [options] [file(s)]

Options:

Examples:

# Write output to both terminal and file
command | tee output.txt

# Append to existing file
command | tee -a output.txt

# Write to multiple files
command | tee file1.txt file2.txt

# Use in a pipeline
command1 | tee output.txt | command2

xargs

Builds and executes commands from standard input.

xargs [options] [command]

Options:

Examples:

# Pass each found file to cat (breaks on names containing whitespace)
find . -name "*.txt" | xargs cat

# Substitute each input item for {} in the command
find . -name "*.txt" | xargs -I{} cp {} /backup/

# Limit arguments per command
find . -name "*.txt" | xargs -n 2 echo

# Null-terminated input (good for filenames with spaces)
find . -name "*.txt" -print0 | xargs -0 cat

# Prompt before execution
find . -name "*.txt" | xargs -p rm
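
The null-terminated form matters as soon as a filename contains a space. As a minimal sketch, the hypothetical helper `count_txt_lines` below counts lines across all .txt files in a directory. It assumes a find and xargs that support `-print0` and `-0` (GNU and BSD versions do), and the sample files are invented.

```bash
# count_txt_lines DIR: total number of lines in all .txt files under DIR.
# Hypothetical helper; NUL-separated names keep "my notes.txt" as one argument.
count_txt_lines() {
    find "$1" -name '*.txt' -print0 | xargs -0 cat | wc -l
}

# Demo: one file name contains a space (invented sample data)
demo=$(mktemp -d)
printf 'one\ntwo\n' > "$demo/my notes.txt"
printf 'three\n' > "$demo/plain.txt"
count_txt_lines "$demo"
rm -rf "$demo"
```

Without `-print0` and `-0`, xargs would split "my notes.txt" into two arguments and cat would fail to find either one.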

jq

Command-line JSON processor.

jq [options] filter [file(s)]

Options:

Examples:

# Pretty-print JSON
jq '.' file.json

# Extract specific field
jq '.name' file.json

# Extract nested field
jq '.user.address.city' file.json

# Iterate over array elements
jq '.users[]' file.json

# Filter array
jq '.users[] | select(.age > 30)' file.json

# Transform data
jq '.users[] | {name: .name, age: .age}' file.json

# Count elements
jq '.users | length' file.json

# Raw output (no quotes)
jq -r '.name' file.json

cat

Options:

- `-b`: Number non-blank output lines
- `-s`: Suppress repeated empty output lines
- `-A`: Show all non-printing characters
- `-T`: Show tabs as ^I
- `-E`: Show end of line as $

Examples:

# Display file contents
cat file.txt

# Show line numbers
cat -n file.txt

# Concatenate multiple files
cat file1.txt file2.txt > combined.txt

# Create a file from typed input (press Ctrl+D on an empty line to save and exit)
cat > newfile.txt
This is a line of text.

# Append typed input to an existing file (press Ctrl+D to finish)
cat >> file.txt
Additional text

less

A pager that allows backward and forward navigation through file contents.

less [options] file

Options:

Navigation Commands (when less is running):

Examples:

# View a file with less
less file.txt

# Show line numbers
less -N file.txt

# Case-insensitive search (unless the pattern contains uppercase letters)
less -i file.txt

# View with line wrap disabled
less -S file.txt

more

A simple pager for viewing text one screen at a time (forward only).

more [options] file

Options:

Examples:

# View a file with more
more file.txt

# View multiple files
more file1.txt file2.txt

# View with helpful prompts
more -d file.txt

head

Displays the beginning of a file.

head [options] [file(s)]

Options:

Examples:

# Show first 10 lines
head file.txt

# Show first 20 lines
head -n 20 file.txt

# Show first 100 bytes
head -c 100 file.txt

# Show first 5 lines of multiple files
head -n 5 file1.txt file2.txt

tail

Displays the end of a file.

tail [options] [file(s)]

Options:

Examples:

# Show last 10 lines
tail file.txt

# Show last 20 lines
tail -n 20 file.txt

# Show last 100 bytes
tail -c 100 file.txt

# Follow file updates in real-time
tail -f /var/log/syslog

# Follow file updates even if file is rotated
tail -F /var/log/syslog
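
head and tail can be combined to pull a range of lines out of the middle of a file. The sketch below assumes a hypothetical helper named `line_range` and invented sample data; `tail -n +N` starts output at line N.

```bash
# line_range START END FILE: print lines START..END (1-based, inclusive).
# Hypothetical helper for illustration, not a standard command.
line_range() {
    head -n "$2" "$3" | tail -n "+$1"
}

# Demo: write ten numbered lines, then print lines 4 through 6
demo=$(mktemp)
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > "$demo"
line_range 4 6 "$demo"
rm -f "$demo"
```

head first trims the file to its first END lines, and tail then drops everything before START.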

grep

Searches for patterns in text.

grep [options] pattern [file(s)]

Options:

Examples:

# Search for pattern in file
grep "error" logfile.txt

# Case-insensitive search
grep -i "error" logfile.txt

# Show line numbers
grep -n "error" logfile.txt

# Search recursively in directory
grep -r "TODO" /path/to/project/

# Show 2 lines before and after match
grep -C 2 "error" logfile.txt

# Show only filenames containing matches
grep -l "error" *.log

# Count matching lines (not individual occurrences)
grep -c "error" logfile.txt

# Match whole words only
grep -w "error" logfile.txt

# Use extended regular expressions (here, alternation)
grep -E "error|warning" logfile.txt

# Show lines that don't match
grep -v "success" logfile.txt
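
`grep -c` reports the number of matching lines, which differs from the number of matches when a line contains the pattern more than once. The sketch below contrasts the two counts. The helper names and sample data are hypothetical, and `-o` is assumed to be available (GNU and BSD grep support it).

```bash
# Hypothetical helpers contrasting the two kinds of count.
count_matching_lines() {
    grep -c -- "$1" "$2"
}
count_occurrences() {
    # -o prints each match on its own line, so wc -l counts matches
    grep -o -- "$1" "$2" | wc -l
}

# Demo: the first line contains the pattern twice (invented sample data)
demo=$(mktemp)
printf 'error error\nok\nerror\n' > "$demo"
count_matching_lines error "$demo"   # lines containing at least one match
count_occurrences error "$demo"      # individual matches
rm -f "$demo"
```

For the sample file, two lines contain "error" but the word occurs three times in total.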

sed

Stream editor for filtering and transforming text.

sed [options] 'command' [file(s)]

Options:

Common commands:

Examples:

# Replace first occurrence of pattern in each line
sed 's/old/new/' file.txt

# Replace all occurrences of pattern in each line
sed 's/old/new/g' file.txt

# Replace on specific line number
sed '3s/old/new/' file.txt

# Replace in line range
sed '2,5s/old/new/g' file.txt

# Delete specific lines
sed '2,5d' file.txt

# Delete lines matching pattern
sed '/pattern/d' file.txt

# Print only lines matching pattern
sed -n '/pattern/p' file.txt

# Append text after line
sed '2a\New line text' file.txt

# Multiple commands
sed -e 's/old/new/g' -e '/pattern/d' file.txt

# Edit file in-place (GNU sed; BSD/macOS sed requires a suffix argument, e.g. -i '')
sed -i 's/old/new/g' file.txt

# Edit file in-place, keeping a backup as file.txt.bak
sed -i.bak 's/old/new/g' file.txt

awk

Pattern scanning and processing language.

awk [options] 'program' [file(s)]

Options:

Program structure:

pattern { action }

Built-in variables:

Examples:

# Print specific columns (fields)
awk '{print $1, $3}' file.txt

# Use custom field separator
awk -F, '{print $1, $3}' csv_file.csv

# Print lines matching pattern
awk '/pattern/' file.txt

# Conditional actions
awk '$3 > 100 {print $1, $3}' file.txt

# Calculate sum
awk '{sum += $1} END {print "Sum:", sum}' numbers.txt

# Print line numbers and lines
awk '{print NR, $0}' file.txt

# Format output
awk '{printf "%-10s %s\n", $1, $2}' file.txt

# Process specific fields
awk -F: '{print "Username: " $1 ", Shell: " $NF}' /etc/passwd

# Multiple patterns and actions
awk '$1 == "ERROR" {print "Error on line", NR} 
     $1 == "WARNING" {print "Warning on line", NR}' logfile.txt
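
The pattern { action } structure also includes the special patterns BEGIN and END, which run before the first and after the last input line. As a minimal sketch, the hypothetical helper `sum_by_key` below totals the second column for each distinct value in the first column. The input layout ("name amount" per line) is an assumption made for the example.

```bash
# sum_by_key [FILE...]: total the second field for each value of the first.
# Hypothetical helper; reads standard input when no file is given.
sum_by_key() {
    awk '
        BEGIN { OFS = "\t" }                        # runs once, before any input
        NF >= 2 { total[$1] += $2 }                 # runs for lines with 2+ fields
        END { for (k in total) print k, total[k] }  # runs once, after all input
    ' "$@" | sort
}

# Demo with invented sample data
printf 'alice 10\nbob 5\nalice 7\n' | sum_by_key
```

The trailing sort is there because awk does not guarantee the order in which `for (k in total)` visits the keys.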

sort

Sorts lines of text files.

sort [options] [file(s)]

Options:

Examples:

# Simple sort
sort file.txt

# Numeric sort
sort -n numbers.txt

# Reverse sort
sort -r file.txt

# Sort and remove duplicates
sort -u file.txt

# Sort by a key starting at field 2 (the key runs to the end of the line)
sort -k 2 file.txt

# Sort by 3rd field numerically
sort -k 3n file.txt

# Sort by multiple fields
sort -k 1,1 -k 2n file.txt

# Sort using custom delimiter
sort -t: -k 3n /etc/passwd

# Human-readable size sort
ls -lh | sort -k 5h

# Version number sort
sort -V versions.txt

# Sort and save result
sort file.txt -o sorted_file.txt
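
A key written as `-k 2` extends from field 2 to the end of the line, while `-k 2,2` restricts the comparison to field 2 alone. The sketch below uses restricted keys to sort numerically by the second field and break ties on the first. `sort_by_score` is a hypothetical helper, the sample data is invented, and `LC_ALL=C` is set only to make the ordering predictable.

```bash
# sort_by_score [FILE...]: order lines by numeric field 2, then by field 1.
# Hypothetical helper; each key is limited to a single field with "start,end".
sort_by_score() {
    LC_ALL=C sort -k 2,2n -k 1,1 "$@"
}

# Demo with invented sample data
printf 'bob 20\nalice 20\ncarol 3\n' | sort_by_score
```

With the sample input, carol (3) comes first, followed by alice and bob, who share a score of 20 and are ordered by name.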

uniq

Reports or filters out repeated adjacent lines.

uniq [options] [input [output]]

Options:

Examples:

# Remove consecutive duplicate lines
uniq file.txt

# Count occurrences
uniq -c file.txt

# Show only duplicate lines
uniq -d file.txt

# Show only unique lines
uniq -u file.txt

# Ignore case when comparing
uniq -i file.txt

# Skip fields when comparing
uniq -f 2 file.txt

# Often used with sort
sort file.txt | uniq
sort file.txt | uniq -c
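
Since uniq only compares neighbouring lines, sorting first is what makes the counts cover the whole input. A common extension of the pipeline above is a frequency table. The sketch below assumes a hypothetical helper named `top_lines` and invented sample data.

```bash
# top_lines N [FILE...]: print the N most frequent lines with their counts.
# Hypothetical helper; reads standard input when no file is given.
top_lines() {
    n=$1
    shift
    # Group identical lines, count them, then order by count (highest first)
    sort "$@" | uniq -c | sort -k1,1nr | head -n "$n"
}

# Demo with invented sample data
printf 'GET\nPOST\nGET\nPUT\nGET\nPOST\n' | top_lines 2
```

In the sample, GET appears three times and POST twice, so those two lines are printed with their counts; the exact padding before each count depends on the uniq implementation.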

cut

Extracts selected sections (bytes, characters, or fields) from each line.

cut [options] [file(s)]

Options:

Examples:

# Extract characters by position
cut -c 1-5 file.txt

# Extract specific character positions
cut -c 1,3,5-7 file.txt

# Extract fields from CSV file
cut -d, -f 1,3 file.csv

# Extract fields from colon-delimited file
cut -d: -f 1,7 /etc/passwd

# Extract everything except specified fields (--complement is a GNU extension)
cut -d, -f 2 --complement file.csv

paste

Merges lines of files horizontally.

paste [options] [file(s)]

Options:

Examples:

# Combine two files side by side (tab-separated)
paste file1.txt file2.txt

# Combine with custom delimiter
paste -d, file1.txt file2.txt

# Serialize: join all lines of each file into a single line, one output line per file
paste -s file1.txt file2.txt

# Combine with multiple delimiters (rotating)
paste -d ",:;" file1.txt file2.txt file3.txt

tr

Translates or deletes characters.

tr [options] set1 [set2]

Options:

Examples:

# Convert lowercase to uppercase
tr 'a-z' 'A-Z' < file.txt

# Delete specific characters
tr -d 'aeiou' < file.txt

# Squeeze repeated spaces into one
tr -s ' ' < file.txt

# Replace newlines with spaces
tr '\n' ' ' < file.txt

# Remove non-printable characters (keeping newlines)
tr -cd '[:print:]\n' < file.txt

# Replace multiple characters
tr '{}' '()' < file.txt

wc

Counts lines, words, and bytes (or characters).

wc [options] [file(s)]

Options:

Examples:

# Count lines, words, and bytes
wc file.txt

# Count only lines
wc -l file.txt

# Count only words
wc -w file.txt

# Count bytes
wc -c file.txt

# Count characters (multibyte-aware)
wc -m file.txt

# Get length of longest line
wc -L file.txt

# Count lines in multiple files
wc -l file1.txt file2.txt

# Count lines in all text files
wc -l *.txt

diff

Compares files line by line.

diff [options] file1 file2

Options: