Shell Script: Finding unique lines of a file without sorting the file

Sourav Gulati

Let's take an example file as follows:

$ cat sortfile

AAAA

FFFF

BBBB

BBBB

CCCC

AAAA

FFFF

DDDD


If you want to find the unique lines of this file, it can be done as follows:


$ sort sortfile | uniq

AAAA

BBBB

CCCC

DDDD

FFFF


However, this changes the order in which the lines occur, because the file has to be sorted first.


The following "awk" command finds the unique lines of a file without sorting it:

$ awk '!x[$0]++' sortfile

AAAA

FFFF

BBBB

CCCC

DDDD


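The expression x[$0]++ uses the whole line ($0) as a key into the associative (hash) array "x" and increments the count stored under that key, from 0 or undefined to 1, 2 and so on. Because ++ is a post-increment, the expression evaluates to the count as it was before the increment. Since there is no code for awk in {}, awk prints the line by default whenever the pattern is true. With ! placed in front of x[$0]++, the pattern is true only when the count for that line was still 0 or undefined, that is, the first time the line is seen. Each line is therefore printed once, in the order of its first occurrence.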
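To make the explanation above more concrete, here is a rough long-hand sketch of what the one-liner does. It is only an illustration and not a different method: the array name "seen" and the variable "unique_lines" are names I made up for this example (the one-liner calls the array "x"), and the first command just recreates the example file from this post.

```shell
# Recreate the example file used in this post.
printf '%s\n' AAAA FFFF BBBB BBBB CCCC AAAA FFFF DDDD > sortfile

# Long-hand sketch of: awk '!x[$0]++' sortfile
# "seen" is an illustrative name for the array called "x" in the one-liner.
unique_lines=$(awk '{
    if (seen[$0] == 0) {   # an element that was never set compares equal to 0
        print $0           # first occurrence of this line: print it
    }
    seen[$0]++             # count this occurrence of the line
}' sortfile)

printf '%s\n' "$unique_lines"
```

This should print AAAA, FFFF, BBBB, CCCC, DDDD (one per line), the same as the one-liner. The short form works because the test and the increment are folded into a single pattern, and awk prints the line by default when the pattern is true.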


February 4, 2013 at 7:15 AM