TODO

* Add support for multiple targets.
TARGETS can be analyzed in parallel, but beware of race conditions when a
conflict arises. Example: t1 in TARGET matches s1 in SOURCE and is stored in the
structure. Then t2 in TARGET matches s1 too: there is a conflict. We need to
update s1, t1 and t2 at the same time until t1 != t2 or the end of file is
reached.
* If the duplicate count is the same on both sides, we could still process the
files. We should minimize the number of renames.
* Multi-threading: the main structure should be protected by a mutex, but the
checksums can be computed in parallel.
* Reduce resident memory usage. Currently, 200000 files in /usr require
~100 MB. Should we use a trie to store paths? Not sure it would save memory.
* Possible optimization: we can skip target (sub)folders where
os.SameFile(sourceFolder, targetFolder) == true. For that we need to store the
source's FileInfo in a map.
* This program could be split in two: the analyzer and the renamer.
* References: dupd, dupfinder, fdupes, gotsync, rmlint, rsync.
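
Sketch for the multi-threading item: the note about a mutex around the main
structure with checksums computed in parallel could look roughly like this.
Everything here is an assumption made for illustration: the names index,
buildIndex and conflicts are hypothetical, file contents are held in memory
instead of being read from disk, and a whole-content SHA-256 stands in for
whatever checksum the program actually uses.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
	"sync"
)

// index is a hypothetical stand-in for the "main structure":
// checksum -> every path whose content has that checksum.
type index struct {
	mu     sync.Mutex
	byHash map[[sha256.Size]byte][]string
}

// add records a path under its checksum. Only this step takes the lock.
func (ix *index) add(sum [sha256.Size]byte, path string) {
	ix.mu.Lock()
	defer ix.mu.Unlock()
	ix.byHash[sum] = append(ix.byHash[sum], path)
}

// buildIndex hashes every entry of contents using a pool of workers.
// The hashing runs without the lock; only the map update is serialized.
func buildIndex(contents map[string][]byte, workers int) *index {
	if workers < 1 {
		workers = 1
	}
	ix := &index{byHash: make(map[[sha256.Size]byte][]string)}
	paths := make(chan string)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range paths {
				// contents is only read here, never written, so
				// concurrent access is safe.
				ix.add(sha256.Sum256(contents[p]), p)
			}
		}()
	}
	for p := range contents {
		paths <- p
	}
	close(paths)
	wg.Wait()
	return ix
}

// conflicts returns the groups of paths that share a checksum,
// sorted so that the result is deterministic.
func (ix *index) conflicts() [][]string {
	ix.mu.Lock()
	defer ix.mu.Unlock()
	var out [][]string
	for _, ps := range ix.byHash {
		if len(ps) > 1 {
			g := append([]string(nil), ps...)
			sort.Strings(g)
			out = append(out, g)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i][0] < out[j][0] })
	return out
}

func main() {
	ix := buildIndex(map[string][]byte{
		"t1": []byte("same"),
		"t2": []byte("same"),
		"t3": []byte("other"),
	}, 4)
	fmt.Println(ix.conflicts()) // t1 and t2 collide, as in the example above
}
```

Keeping the lock inside add means a conflict such as t1 and t2 matching the
same entry is at least detected consistently. Resolving it (updating s1, t1
and t2 together) would still need a wider critical section than this sketch
has.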
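
Sketch for the os.SameFile item: one possible shape for skipping target
folders that are really a source folder. The type sourceDirs and the helpers
add, contains and demo are hypothetical names, and keying the map by path is
an assumption; only os.Stat and os.SameFile come from the standard library.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// sourceDirs maps each visited source folder to its FileInfo
// (hypothetical layout for the map mentioned in the item above).
type sourceDirs map[string]os.FileInfo

// add stats a source folder and remembers its FileInfo.
func (s sourceDirs) add(dir string) error {
	fi, err := os.Stat(dir)
	if err != nil {
		return err
	}
	s[dir] = fi
	return nil
}

// contains reports whether targetDir is the same folder as any recorded
// source folder, in which case the target walk could skip it.
func (s sourceDirs) contains(targetDir string) (bool, error) {
	ti, err := os.Stat(targetDir)
	if err != nil {
		return false, err
	}
	for _, si := range s {
		if os.SameFile(si, ti) {
			return true, nil
		}
	}
	return false, nil
}

// demo builds a throwaway tree and checks two target paths: a different
// spelling of the source folder, and an unrelated folder.
func demo() (bool, bool, error) {
	root, err := os.MkdirTemp("", "samefile")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(root)

	src := filepath.Join(root, "src")
	other := filepath.Join(root, "other")
	for _, d := range []string{src, other} {
		if err := os.Mkdir(d, 0o755); err != nil {
			return false, false, err
		}
	}

	seen := sourceDirs{}
	if err := seen.add(src); err != nil {
		return false, false, err
	}

	// Plain concatenation keeps the ".." in the path; filepath.Join
	// would clean it away and make the two spellings identical.
	sep := string(filepath.Separator)
	alias := other + sep + ".." + sep + "src"

	aliasSkipped, err := seen.contains(alias)
	if err != nil {
		return false, false, err
	}
	otherSkipped, err := seen.contains(other)
	if err != nil {
		return false, false, err
	}
	return aliasSkipped, otherSkipped, nil
}

func main() {
	aliasSkipped, otherSkipped, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("alias of source skipped:", aliasSkipped)
	fmt.Println("unrelated folder skipped:", otherSkipped)
}
```

The linear scan in contains is the simplest option. Whether it is fast
enough for large trees is an open question, like the rest of this item.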