Zygo Blaxell [Wed, 6 Jan 2010 16:56:31 +0000 (11:56 -0500)]
dupemerge: update copyright year to 2010
Zygo Blaxell [Sat, 9 Jan 2010 01:51:45 +0000 (20:51 -0500)]
Merge branch 'performance'
Conflicts:
faster-dupemerge
Zygo Blaxell [Sat, 9 Jan 2010 01:08:45 +0000 (20:08 -0500)]
Update copyright year and email address
It helps my spam filter if I can keep track of which web page the
spammers have scraped my email address from.
Zygo Blaxell [Sat, 9 Jan 2010 01:08:45 +0000 (20:08 -0500)]
Work around new fileutils output
findutils now appends a redundant ".000000000" to the %T@ output.
I've apparently missed the window to get findutils to fix this, so
I've worked around it.
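A minimal sketch of one way such a workaround could look, written in Python for illustration and not taken from faster-dupemerge. The helper name `normalize_mtime` is hypothetical, and the sketch assumes that only whole-second timestamps matter, so any fractional suffix on the `%T@` field can be dropped.

```python
def normalize_mtime(field: str) -> str:
    """Return the whole-second part of a find %T@ timestamp field.

    Hypothetical helper: assumes the caller only needs integer seconds,
    so a suffix such as ".000000000" is simply discarded.
    """
    # split(".", 1) leaves fields without a fractional part unchanged
    return field.split(".", 1)[0]
```

With this, output from older and newer findutils yields the same key, so files are not treated as different merely because one listing carries the extra suffix.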
Zygo Blaxell [Sat, 9 Jan 2010 02:21:44 +0000 (21:21 -0500)]
Update copyright year
root [Sun, 26 Nov 2006 22:05:51 +0000 (22:05 +0000)]
Properly handle cases where multiple files have the same hash
(e.g. because --skip-hash is used). This version now generates all N^2
combinations of comparisons.
git-svn-id: svn+ssh://svn.furryterror.org/r/trunk/mokona/zblaxell@6218 a5e33b96-951a-0410-ae88-c0fe16d076bb
git-svn-id: file:///root/SVN@4 f049ffa3-53c0-42dd-8896-c8778eaba0c5
git-svn-id: file:///root/SVN@10 f049ffa3-53c0-42dd-8896-c8778eaba0c5
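The pairwise comparison described in the commit above can be sketched as follows. This is an illustration in Python, not the script's own code: `comparison_pairs` is a hypothetical name, and the sketch enumerates each unordered pair once (N*(N-1)/2 pairs, which still grows on the order of N^2) rather than reproducing whatever enumeration the script actually uses.

```python
from itertools import combinations

def comparison_pairs(paths):
    """Return every unordered pair of distinct paths in one hash bucket.

    Hypothetical helper: when many files share a hash (for example
    because hashing was skipped), each pair still has to be compared
    byte for byte before the files can be linked.
    """
    # Sorting first makes the order of comparisons reproducible.
    return list(combinations(sorted(paths), 2))
```

For a bucket of four files this yields six pairs; a caller would compare the contents of each pair and link only those that match.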
Zygo Blaxell [Wed, 6 Jan 2010 16:10:04 +0000 (11:10 -0500)]
dupemerge: maybe improve seek performance by sorting perl hashes
Thank Johannes Niess <Linux@johannes-niess.de> for this idea.
To improve seek performance, choose inodes for linking in a fixed order.
This will mean that two directories with multiple identical files will
end up with links to the copies with lower inode numbers. This is an
improvement over the previous result, which was that both directories
would end up with randomly chosen files from both directories.
The sort order isn't strictly numeric; however, it's hopefully close
enough.
As a crude heuristic, we assume that inode numbers approximate file
position on disk, and file names approximate typical usage patterns.
Previously we used the perl hash semantics, which are mostly random
and might change depending on the numbers of files considered.
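The fixed ordering described above can be sketched like this. The sketch is in Python and is an assumption about the general idea, not the script's code: `choose_link_target` is a hypothetical name, and it sorts inode numbers numerically, whereas the commit notes that the real sort order is not strictly numeric.

```python
def choose_link_target(candidates):
    """Pick the file that the other identical files should be linked to.

    candidates: iterable of (inode, path) tuples for files whose
    contents are already known to be identical.
    Hypothetical helper: prefers the lowest inode number, then the
    lexically smallest path, so the choice is the same on every run.
    """
    return min(candidates, key=lambda c: (c[0], c[1]))
```

Because the choice no longer depends on hash iteration order, two directories holding the same set of duplicates end up pointing at the same low-numbered inodes.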
Zygo Blaxell [Wed, 6 Jan 2010 16:07:24 +0000 (11:07 -0500)]
dupemerge: have find tell us the device too
faster-dupemerge cannot be used to link files on multiple filesystems
because the hardlinks will fail; however, if this is attempted anyway
then files with identical weak keys (size+timestamp+permissions) and
identical inode numbers might be considered as identical for hashing
and comparing purposes when they are not. That would be bad.
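A minimal sketch of why the device number belongs in the identity check, written in Python for illustration. The `FileInfo` record and `same_inode` function are hypothetical names, not part of faster-dupemerge.

```python
from collections import namedtuple

# Hypothetical record of the fields find reports for each file.
FileInfo = namedtuple("FileInfo", "dev ino size mtime mode")

def same_inode(a, b):
    """True only if both entries refer to the same inode on the same device.

    Inode numbers are unique per filesystem only, so comparing the inode
    number alone can wrongly equate files on different filesystems.
    """
    return (a.dev, a.ino) == (b.dev, b.ino)
```

Two entries with equal size, timestamp, permissions and inode number but different device numbers are then treated as distinct files and still go through hashing and comparison.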
Zygo Blaxell [Sat, 9 Jan 2010 01:59:05 +0000 (20:59 -0500)]
Remove ad-hoc copyright notice, add formal copyright statement and GPL
git-svn-id: svn+ssh://svn.furryterror.org/r/trunk/mokona/zblaxell@3269 a5e33b96-951a-0410-ae88-c0fe16d076bb
Zygo Blaxell [Sat, 9 Jan 2010 01:58:48 +0000 (20:58 -0500)]
tick_quote: properly quote the string '\''
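For readers unfamiliar with the idiom, the usual way to embed a single quote inside a single-quoted shell word is to close the quote, emit an escaped quote, and reopen it. The Python function below is a sketch of that idiom only; it borrows the name `tick_quote` from the commit subject but is not the script's implementation.

```python
def tick_quote(s: str) -> str:
    """Wrap s in single quotes for a POSIX shell.

    Each embedded ' becomes '\\'' : close the quoted string, add a
    backslash-escaped quote, then reopen the quoted string.
    """
    return "'" + s.replace("'", "'\\''") + "'"
```

The Python standard library offers `shlex.quote` for the same purpose; the hand-written version is shown only to make the escaping visible.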
cvs [Sat, 7 Jan 2006 08:44:02 +0000 (08:44 +0000)]
Implement --dry-run and --humane options
git-svn-id: svn+ssh://svn.furryterror.org/r/trunk/mokona/zblaxell@4518 a5e33b96-951a-0410-ae88-c0fe16d076bb
cvs [Mon, 5 May 2003 04:20:14 +0000 (04:20 +0000)]
digest: Fix incorrect statistics when hashes fail
An order-of-operations bug can lead to files being counted as hashed
when they are not (e.g. due to I/O error or the file disappearing).
Calculate the digest, then increment the statistics.
git-svn-id: svn+ssh://svn.furryterror.org/r/trunk/mokona/zblaxell@3332 a5e33b96-951a-0410-ae88-c0fe16d076bb
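The ordering fix described in the digest commit above can be sketched as follows. This is a Python illustration under stated assumptions, not the script's code: `hash_file`, the `stats` dictionary and its key names are hypothetical, and SHA-1 is chosen only as an example digest.

```python
import hashlib

def hash_file(path, stats):
    """Return the hex digest of a file, or None if it cannot be read.

    The counter is incremented only after the digest has been computed,
    so a file that vanishes or hits an I/O error is not counted as hashed.
    """
    try:
        h = hashlib.sha1()  # assumption: any stable digest would do here
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
    except OSError:
        stats["hash_errors"] = stats.get("hash_errors", 0) + 1
        return None
    stats["hashed"] = stats.get("hashed", 0) + 1
    return h.hexdigest()
```

Incrementing before the read would produce exactly the inflated count the commit message describes.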
Zygo Blaxell [Sat, 9 Jan 2010 01:04:54 +0000 (20:04 -0500)]
Replace --trust with --skip-compare and add --skip-hash and copyright statement
git-svn-id: svn+ssh://svn.furryterror.org/r/trunk/mokona/zblaxell@3225 a5e33b96-951a-0410-ae88-c0fe16d076bb
Conflicts:
faster-dupemerge
root [Tue, 23 Dec 2008 19:52:04 +0000 (14:52 -0500)]
Initial commit