Simple way to remove duplicate entries from your history file

This sorts by URL, removes duplicates (ignoring timestamps), and sorts again. The result is a new history file that keeps only the earliest visit of each URL you visited more than once.

sort -k3 history | uniq -f2 | sort > history.new
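A small worked example may make the effect easier to see. This is only a sketch: the sample entries, the scratch directory `/tmp/clean_history_demo`, and the assumed line layout (`DATE TIME URL TITLE`, separated by spaces) are illustrative assumptions, not taken from a real history file.

```shell
# Assumption: each history line looks like "DATE TIME URL TITLE".
# Work in a scratch directory so no real history file is touched.
export LC_ALL=C   # byte-wise ordering, so the result is reproducible
mkdir -p /tmp/clean_history_demo
cd /tmp/clean_history_demo || exit 1

# Hypothetical sample data: two URLs, each visited twice.
cat > history <<'EOF'
2011-03-01 10:00:00 http://example.org/ Example
2011-03-02 09:30:00 http://example.com/a Page A
2011-03-03 18:45:00 http://example.org/ Example
2011-03-04 08:15:00 http://example.com/a Page A
EOF

# sort -k3  : group identical URLs together (ties fall back to the whole
#             line, so the earliest timestamp comes first in each group)
# uniq -f2  : skip the two timestamp fields when comparing, keep the first
# sort      : put the surviving lines back in chronological order
sort -k3 history | uniq -f2 | sort > history.new
cat history.new
```

With this sample, history.new should hold two lines: the 2011-03-01 visit to http://example.org/ followed by the 2011-03-02 visit to http://example.com/a. Note that uniq -f2 compares everything after the timestamp, so two visits to the same URL with different titles would both be kept.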

Sort by URL, count the occurrences of each URL, sort by that count, and keep the 5000 most visited:

sort -k3 history | uniq -f2 -c | sort -n | sed 's/[0-9 ]* //' | tail -n 5000 > history.new
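The same idea can be tried on a tiny file by shrinking the tail count. Again this is a sketch under assumptions: the sample entries, the directory `/tmp/clean_history_top_demo`, and the `DATE TIME URL TITLE` layout are made up for illustration, and `tail -n 2` stands in for `tail -n 5000`.

```shell
# Assumption: each history line looks like "DATE TIME URL TITLE".
export LC_ALL=C   # byte-wise ordering, so the result is reproducible
mkdir -p /tmp/clean_history_top_demo
cd /tmp/clean_history_top_demo || exit 1

# Hypothetical sample: example.org visited 3 times, example.com/a twice,
# example.net once.
cat > history <<'EOF'
2011-03-01 10:00:00 http://example.org/ Example
2011-03-02 09:30:00 http://example.com/a Page A
2011-03-03 18:45:00 http://example.org/ Example
2011-03-04 08:15:00 http://example.net/ Net
2011-03-05 12:00:00 http://example.org/ Example
2011-03-06 07:20:00 http://example.com/a Page A
EOF

# uniq -c prefixes each surviving line with its visit count,
# sort -n orders by that count (ascending),
# sed strips the count prefix again,
# tail keeps the last, i.e. most visited, entries (2 here instead of 5000).
sort -k3 history | uniq -f2 -c | sort -n | sed 's/[0-9 ]* //' | tail -n 2 > history.new
cat history.new
```

With this sample, the single visit to http://example.net/ is dropped and history.new ends with the most visited URL, so the file is ordered by visit count rather than by time. The sed expression removes a leading run of digits and spaces up to a space, so it relies on the date containing a non-digit character such as the dash in 2011-03-01; with a purely numeric timestamp it would strip the timestamp as well.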
 
clean_history.txt · Last modified: 2011/03/08 19:38 by bct
 