pb
Joined: 04 Sep 2007 Posts: 32
Posted: Tue Sep 04, 2007 5:11 am Post subject: diff based backup
I use VV both for syncing and for backup. This is about backup.
My most active large files for backup are mail files. These are all ASCII text files. Very often a 300MB mail file has one or two messages added or deleted, and must be backed up because it has changed. When that happens, the full 300MB is copied and, in my case, another 300MB is archived (the old backed-up version gets archived).
It would be vastly more efficient to use diff or, even better, an open source version control system, to save only the incremental changes to such files. When only a message or two changes in a 300MB file, the diff is a tiny fraction of the file, so this could improve storage, and likely speed, by orders of magnitude. Presumably, as programmers, you already use version control. To back up a file that is a candidate for this method, VV would quietly just "check it in", which would simply store a diff in the version control system.
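A minimal sketch of the idea in Python, assuming the file is line-oriented text and fits in memory. The function names `make_delta` and `apply_delta` are made up for illustration; this is not how VV or any particular version control system stores changes, just the general technique of keeping only the changed regions.

```python
from difflib import SequenceMatcher


def make_delta(old_lines, new_lines):
    """Return a compact list of edits that turns old_lines into new_lines.

    Each edit is (start, end, replacement): replace old_lines[start:end]
    with the replacement lines. Unchanged regions are not stored at all.
    """
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            delta.append((i1, i2, new_lines[j1:j2]))
    return delta


def apply_delta(old_lines, delta):
    """Rebuild the new version from the old version plus the stored delta."""
    result = []
    pos = 0
    for start, end, replacement in delta:
        result.extend(old_lines[pos:start])  # copy the unchanged run
        result.extend(replacement)           # then the changed lines
        pos = end
    result.extend(old_lines[pos:])
    return result


# Hypothetical mail file: one message deleted, one message appended.
old = ["msg1\n", "msg2\n", "msg3\n"]
new = ["msg1\n", "msg3\n", "msg4\n"]
delta = make_delta(old, new)
restored = apply_delta(old, delta)
```

Here only the delta would need to be written to the backup. Swapping the arguments gives a reverse delta instead, which fits the case where the newest copy is kept whole and older versions are archived. A pure-Python line comparison like this would probably be too slow for a 300MB file; it is only meant to show the shape of the data being saved.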
This would be a little more annoying to the end user when looking at the backup directory, since it would just contain a lot of diffs. But you could provide a tool to make it look more "normal".
Of course, the user would control whether he wants this type of backup. Defaults could be based on file extension (running a diff would also tell you whether the diff is significantly smaller than the whole file), and the user could mark certain files or folders to be backed up this way.
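The selection rule described above could be sketched roughly like this. The extension list, the 0.25 threshold, and the names `diff_ratio` and `backup_mode` are all assumptions picked for illustration, not anything VV provides.

```python
import difflib
from pathlib import Path

# Hypothetical defaults: extensions assumed to be line-oriented text,
# and the largest diff-to-file size ratio still worth storing as a diff.
TEXT_EXTENSIONS = {".mbox", ".txt", ".log"}
MAX_RATIO = 0.25


def diff_ratio(old_text, new_text):
    """Size of a unified diff relative to the size of the new file."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        n=0,  # no context lines, so only the changes are counted
    )
    diff_size = sum(len(line) for line in diff)
    return diff_size / max(len(new_text), 1)


def backup_mode(filename, old_text, new_text, user_marked=False):
    """Return "diff" or "full" for one changed file."""
    is_candidate = user_marked or Path(filename).suffix.lower() in TEXT_EXTENSIONS
    if not is_candidate:
        return "full"
    return "diff" if diff_ratio(old_text, new_text) <= MAX_RATIO else "full"


# Hypothetical data: a 1000-message file with one message appended,
# and the same file replaced by entirely different content.
old = "".join(f"message {i}\n" for i in range(1000))
appended = old + "message 1000\n"
rewritten = "".join(f"other {i}\n" for i in range(1000))
```

With these inputs, `backup_mode("inbox.mbox", old, appended)` picks the diff, while a fully rewritten file or an unlisted extension falls back to a full copy unless the user has marked it.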
In the long run, this idea can be taken further. For example, other file types are not line-oriented, but special software might be available to determine their diffs, e.g., for PDF files. Thus, this could be user-extensible if the user provides a suitable diff program for his favorite file types.
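One way the user-extensible part might look is a small table mapping file extensions to external diff commands. Everything here is hypothetical: the registry, the `{old}`/`{new}` placeholders, and especially `my-pdf-diff`, which stands in for whatever program the user supplies.

```python
import subprocess
from pathlib import Path

# Extension -> command template; filled in by the user.
DIFF_TOOLS = {}


def register_diff_tool(extension, command):
    """Register a command (list of arguments) for one file extension.

    The arguments may contain the placeholders "{old}" and "{new}".
    """
    DIFF_TOOLS[extension.lower()] = command


def build_diff_command(old_path, new_path):
    """Return the argument list to run, or None if no tool is registered."""
    template = DIFF_TOOLS.get(Path(new_path).suffix.lower())
    if template is None:
        return None
    return [arg.format(old=old_path, new=new_path) for arg in template]


def run_diff(old_path, new_path):
    """Run the registered tool and return its output as bytes, or None."""
    command = build_diff_command(old_path, new_path)
    if command is None:
        return None  # caller falls back to a full copy
    # diff-style tools commonly exit non-zero when files differ,
    # so a non-zero status is not treated as an error here.
    return subprocess.run(command, capture_output=True, check=False).stdout


register_diff_tool(".txt", ["diff", "{old}", "{new}"])
register_diff_tool(".pdf", ["my-pdf-diff", "{old}", "{new}"])  # hypothetical tool
```

A file type with no registered tool simply gets a full copy, so the default behavior stays the same for anyone who does not configure anything.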
--peter