https://hub.docker.com/r/sciactive/nephele

In the latest version of Nephele, you can now create a WebDAV server that deduplicates files that you add to it.

I created this feature because every night at midnight, my Minecraft world that my friends and I play on gets backed up. Our world has grown to about 5 GB, but every night, the same files get backed up over and over. It’s a waste of space to store the same files again and again, but I want the ability to roll back our world to any day in the past.

So with this new feature of Nephele, I can upload the Minecraft backup and only the files that have changed will take up additional space. It’s like having infinite incremental backups that never need a full backup after the first time, and can be accessed instantly.

Nephele will only delete a file from the file storage once all copies that share the same file contents have been deleted, so unlike with most incremental backup solutions, you can delete previous backups easily and regain space.
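
To make the delete behavior above concrete, here is a minimal sketch of a content-addressed store with reference counting. This is not Nephele's actual code. The `DedupStore` class, its method names, the in-memory dictionaries, and the choice of SHA-256 are all assumptions made for illustration only.

```python
import hashlib


class DedupStore:
    """Toy in-memory content-addressed store with reference counting.

    Hypothetical illustration only; not how Nephele is implemented.
    """

    def __init__(self):
        self.blobs = {}  # digest -> file contents (stored once)
        self.refs = {}   # digest -> number of paths pointing at it
        self.paths = {}  # path -> digest

    def put(self, path, data):
        if path in self.paths:
            self.delete(path)  # overwriting releases the old contents first
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blobs:
            self.blobs[digest] = data  # first copy: actually store the bytes
        self.refs[digest] = self.refs.get(digest, 0) + 1
        self.paths[path] = digest

    def get(self, path):
        return self.blobs[self.paths[path]]

    def delete(self, path):
        digest = self.paths.pop(path)
        self.refs[digest] -= 1
        if self.refs[digest] == 0:
            # Last copy sharing these contents is gone: reclaim the space.
            del self.refs[digest]
            del self.blobs[digest]

    def stored_bytes(self):
        return sum(len(blob) for blob in self.blobs.values())


store = DedupStore()
store.put("monday/region.dat", b"unchanged chunk")
store.put("tuesday/region.dat", b"unchanged chunk")  # adds no new bytes
store.put("tuesday/level.dat", b"new data")
store.delete("monday/region.dat")  # contents stay, Tuesday still uses them
print(store.stored_bytes())
```

In this sketch, deleting Monday's backup frees nothing because Tuesday's copy still points at the same contents. The bytes are only dropped once the last path referencing them is deleted.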

Edit: So, I think my post is causing some confusion. I should make it clear that my use case is specific to me. This is a general-purpose deduplicating file server. It will take any files you give it and deduplicate them in its storage. It’s not a backup system, and it’s not a versioning system. My use case is only one of many things you can use a deduplicating file server for.

  • Lem453@lemmy.ca
    5 months ago

    Start with this to learn how snapshots work

    https://fedoramagazine.org/working-with-btrfs-snapshots/

    Then read this to learn how to make automatic snapshots with retention

    https://ounapuu.ee/posts/2022/04/05/btrfs-snapshots/

    I do something very similar with ZFS snapshots and deduplication on. I take a snapshot every 5 minutes and keep 1 hour’s worth, then keep 24 hourly snapshots each day, one daily snapshot for a month, etc.

    For backup to remote locations you can send a snapshot offsite

    • hperrin@lemmy.worldOP
      5 months ago

      Having a separate tool do the work of making a snapshot doesn’t mean what I said is wrong. Snapshots are not automatic, with regard to btrfs. You can have a tool automatically make a snapshot, but btrfs won’t do it for you.

      My overall point is that a deduplicating file server has very little in common with btrfs snapshots. The original commenter looked at my use case for my own deduplicating file server and assumed that the server was the same thing as my use case.

      I think if they took the time to look at the server and see what it is actually doing, they would see that it is very different from btrfs.

      • Lem453@lemmy.ca
        5 months ago

        I use ZFS, so I’m not sure about others, but I thought all CoW file systems have deduplication already? ZFS has it turned on by default. Why make your own file deduplication system instead of just using a ZFS filesystem and letting that do the work for you?

        Snapshots are also extremely efficient on CoW filesystems like ZFS, as they only store the diff between the previous state and the current one, so taking a snapshot every 5 minutes is not a big deal for my homelab.

        I can easily explore any of the snapshots and pull any file from any of them.

        I’m not trying to shit on your project, just trying to understand its use case, since it seems to me ZFS provides all the benefits already.

        • hperrin@lemmy.worldOP
          5 months ago

          Btrfs does not have its own built-in deduplication like ZFS does. I’m surprised ZFS has it turned on by default, considering file-system-level deduplication is fairly CPU and RAM intensive. But yeah, if you can use a deduplicated file system, go for it.

          In my use case, I’m not willing to move away from ext4 (on my home server, which is where this is running), and I don’t need all files on my file system to be deduplicated, just a set of files that I add to every day. I made this because it fits my use cases better than any other solution (this current use case, and some more I’m planning to implement in the future).

          As far as using snapshots to implement my current use case, it’s not possible. My Minecraft server runs on a different system than where I put my backups, and I want it that way. They are meant to be backups, not versions, and backups shouldn’t be stored on the same system. That server has also been migrated several times since I first started running it in 2019. I have backups that go that far back too. So I need a system that I can put years’ worth of existing backups into, not just start taking backups now.