Stream text processing
cat
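Concatenates the given files and prints them to standard output; with a single file, it simply prints its contents.
Some useful options:
- -n numbers all output lines
- -b numbers only the non-blank lines
- -s squeezes consecutive blank lines into a single blank line
- -A shows non-printing characters (tabs as ^I, line ends as $)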
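A minimal sketch of the numbering and squeezing options, assuming GNU cat. The sample file and its contents are made up for illustration.

```shell
# Hypothetical sample file: two text lines separated by two blank lines.
sample=$(mktemp)
printf 'first\n\n\nsecond\n' > "$sample"

# -s squeezes the two blank lines into one, -n then numbers all three lines.
squeezed=$(cat -s -n "$sample")
printf '%s\n' "$squeezed"

# -b numbers only the non-blank lines, so "second" gets number 2.
nonblank=$(cat -b "$sample")
printf '%s\n' "$nonblank"

rm -f "$sample"
```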
tac (the opposite of cat)
Reads a text file from the last line to the first
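A quick sketch of the reversal, assuming GNU coreutils (tac is not available by default on every Unix). The three input lines are made up.

```shell
# tac prints the lines of its input in reverse order.
reversed=$(printf 'one\ntwo\nthree\n' | tac)
printf '%s\n' "$reversed"
```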
head
Prints the first 10 lines of a file by default, as POSIX specifies.
Some useful options:
- -n<number> prints the first <number> lines
- -<number> is a shorthand for -n<number>
- -c<number> prints the first <number> bytes
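A small sketch of the line and byte limits. It assumes the seq command is available to generate numbered lines in place of a real file.

```shell
# seq 1 20 prints the numbers 1 to 20, one per line.
first_three=$(seq 1 20 | head -n 3)
printf '%s\n' "$first_three"

# -c counts bytes instead of lines: only "abcd" is printed.
first_bytes=$(printf 'abcdefgh\n' | head -c 4)
printf '%s\n' "$first_bytes"
```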
tail
Prints the last 10 lines of a file by default, as POSIX specifies.
Some useful options:
- -n<number> prints the last <number> lines
- -<number> is a shorthand for -n<number>
- -c<number> prints the last <number> bytes
- -f prints the last lines and keeps the file open, printing new lines as they are appended
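A small sketch of the same limits applied to the end of the input. It assumes seq is available; the -f form is shown only as a comment because it runs until interrupted, and its log path is hypothetical.

```shell
# Last three of the numbers 1 to 20.
last_three=$(seq 1 20 | tail -n 3)
printf '%s\n' "$last_three"

# -c counts bytes from the end: only "fgh" is printed.
last_bytes=$(printf 'abcdefgh' | tail -c 3)
printf '%s\n' "$last_bytes"

# Following a growing file (hypothetical path); stop it with Ctrl-C:
#   tail -f /path/to/app.log
```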
less
Displays a file one page at a time. You can navigate with the arrow keys and Page Up/Page Down.
Type / followed by a pattern to search forward; n jumps to the next occurrence and N jumps to the previous one.
wc
Counts the lines, words, and bytes of a file (printed in that order when run without options). Some useful options:
- -l prints the number of lines
- -c or --bytes prints the number of bytes
- -m or --chars prints the number of characters
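A sketch of the difference between bytes and characters, using a made-up sample file. The character count depends on the current locale, so it is printed but not relied on.

```shell
sample=$(mktemp)
# Two lines; the second ends in "é", written as the UTF-8 bytes \303\251.
printf 'one two\ncaf\303\251\n' > "$sample"

# Redirecting the file into wc makes it print only the number, without the file name.
lines=$(wc -l < "$sample")   # 2 lines
bytes=$(wc -c < "$sample")   # 14 bytes: "é" takes two bytes
chars=$(wc -m < "$sample")   # 13 in a UTF-8 locale, locale-dependent otherwise

echo "lines=$lines bytes=$bytes chars=$chars"
rm -f "$sample"
```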
nl
Similar to cat -b: numbers the lines of a file and, by default, leaves empty lines unnumbered.
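A minimal sketch with made-up input: the empty line in the middle is kept in the output but gets no number.

```shell
# Three input lines, the second one empty.
numbered=$(printf 'alpha\n\nbeta\n' | nl)
printf '%s\n' "$numbered"   # "alpha" is numbered 1 and "beta" is numbered 2
```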
sort
Sorts the lines in alphabetical order (following the current locale). Some useful options:
- -r reverses the sort order
- -k<number> sorts by the given field, e.g. -k2 sorts by the second field (fields are separated by spaces or tabs)
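A short sketch of the three variants on made-up data (a name followed by a number).

```shell
# Hypothetical sample data: name in field 1, number in field 2.
data=$(printf 'carol 30\nalice 25\nbob 35\n')

by_name=$(printf '%s\n' "$data" | sort)        # alice, bob, carol
reversed=$(printf '%s\n' "$data" | sort -r)    # carol, bob, alice
by_second=$(printf '%s\n' "$data" | sort -k2)  # 25, 30, 35

printf '%s\n' "$by_second"
```

Note that -k2 still compares text, not numbers; it works here only because all the numbers have the same width.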
uniq
Collapses adjacent duplicate lines so that each run of identical lines is printed once. Only consecutive duplicates are detected: if two identical lines are not next to each other, both are printed. Some useful options:
- -d prints only the duplicated lines (once per run)
- -c prefixes each line with the number of times it occurred
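A sketch of the adjacency rule on made-up input. Piping through sort first is one common way to remove duplicates that are not next to each other.

```shell
# "a" appears twice in a row, then again after "b".
input=$(printf 'a\na\nb\na\n')

deduped=$(printf '%s\n' "$input" | uniq)            # a, b, a (the last "a" is not adjacent)
dups=$(printf '%s\n' "$input" | uniq -d)            # a (only the run of two)
counted=$(printf '%s\n' "$input" | uniq -c)         # counts per run: 2 a, 1 b, 1 a
sorted_unique=$(printf '%s\n' "$input" | sort | uniq)  # a, b

printf '%s\n' "$counted"
```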
od (octal dump)
Dumps the file contents in octal by default. Some useful options:
- -tx or -t x2 shows the contents in hexadecimal (the digit after x is the number of bytes per unit)
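A small sketch of the output formats. The -t x1 form (one byte per unit) is an addition not listed above; the input is two identical bytes so that the 2-byte grouping does not depend on the machine's byte order.

```shell
# Both input bytes are "A" (0x41).
octal=$(printf 'AA' | od)            # default: octal, 2-byte units
hex_words=$(printf 'AA' | od -t x2)  # hexadecimal, 2-byte units: 4141
hex_bytes=$(printf 'AA' | od -t x1)  # hexadecimal, 1-byte units: 41 41

printf '%s\n' "$octal" "$hex_words" "$hex_bytes"
```

The first column of each output line is the offset into the input, not part of the data.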
join
Joins two files on a common index field, which by default is the first field of each file. Both files should be sorted on that field.
(e.g.)
sh-5.2$ cat users
1 user1
2 user2
3 user3
sh-5.2$ cat rating
1 4.9
2 4.7
3 4.5
sh-5.2$ join users rating
1 user1 4.9
2 user2 4.7
3 user3 4.5
Some useful options:
- -j<number> specifies which field is used as the index field in both files
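A sketch of joining on a field other than the first, assuming GNU join. The two files are hypothetical variants of the example data with the id moved to the second field.

```shell
workdir=$(mktemp -d)
# Hypothetical sample data with the id in the SECOND field, sorted by that field.
printf 'user1 1\nuser2 2\nuser3 3\n' > "$workdir/users"
printf '4.9 1\n4.7 2\n4.5 3\n' > "$workdir/rating"

# -j 2 joins on field 2 of both files; the join field is printed first.
joined=$(join -j 2 "$workdir/users" "$workdir/rating")
printf '%s\n' "$joined"

rm -rf "$workdir"
```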
paste
Merges the files line by line: line N of each file is combined into a single output line, separated by tabs.
(e.g.)
sh-5.2$ cat users
1 user1
2 user2
3 user3
sh-5.2$ cat rating
1 4.9
2 4.7
3 4.5
sh-5.2$ paste users rating
1 user1 1 4.9
2 user2 2 4.7
3 user3 3 4.5
split
Splits a file into multiple files. By default each piece holds 1000 lines and the pieces are named xaa, xab, xac, and so on. Some useful options:
- -l<number> or -<number> sets how many lines each output file holds.
- -b<number> sets how many bytes each output file holds.
(e.g.)
sh-5.2# ls -l
total 56
-rw-r--r-- 1 root root 56827 Aug 6 19:33 test_file
sh-5.2# wc -l test_file
591 test_file
sh-5.2# split -l300 test_file
sh-5.2# ls -l
total 116
-rw-r--r-- 1 root root 56827 Aug 6 19:33 test_file
-rw-r--r-- 1 root root 27759 Aug 6 19:37 xaa
-rw-r--r-- 1 root root 29068 Aug 6 19:37 xab
sh-5.2# split -l300 test_file renamed_output_
sh-5.2# ls -l
total 176
-rw-r--r-- 1 root root 27759 Aug 6 19:37 renamed_output_aa
-rw-r--r-- 1 root root 29068 Aug 6 19:37 renamed_output_ab
-rw-r--r-- 1 root root 56827 Aug 6 19:33 test_file
-rw-r--r-- 1 root root 27759 Aug 6 19:37 xaa
-rw-r--r-- 1 root root 29068 Aug 6 19:37 xab
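The transcript above splits by lines; a byte-based split can be sketched the same way. The directory, file name, and part_ prefix below are made up for illustration.

```shell
workdir=$(mktemp -d)
# Hypothetical 10-byte input file.
printf 'abcdefghij' > "$workdir/data"

# -b 4 cuts the file into 4-byte pieces: part_aa, part_ab, and a 2-byte part_ac.
split -b 4 "$workdir/data" "$workdir/part_"

pieces=$(ls "$workdir" | grep -c '^part_')
last_size=$(wc -c < "$workdir/part_ac")

# Concatenating the pieces in name order restores the original content.
rejoined=$(cat "$workdir"/part_*)

echo "pieces=$pieces last_size=$last_size rejoined=$rejoined"
rm -rf "$workdir"
```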
tr
Translates (replaces) characters from its input with other characters, or deletes them. It reads only from stdin, so input must be piped or redirected into it.
(e.g.)
sh-5.2$ cat users
1 user1
2 user2
3 user3
sh-5.2$ cat users | tr '[:lower:]' '[:upper:]'
1 USER1
2 USER2
3 USER3
sh-5.2$ cat users | tr ' ' '_'
1_user1
2_user2
3_user3
sh-5.2$ cat users | tr e J
1 usJr1
2 usJr2
3 usJr3
sh-5.2$ cat users | tr -d u
1 ser1
2 ser2
3 ser3
Some useful options:
- -d deletes every occurrence of the given characters.