• 7 Posts
  • 123 Comments
Joined 2 years ago
Cake day: June 17th, 2023

    • Ever tested restoring those backups? Do you have the exact procedure written down? Does it still work? If the service gets compromised or the data corrupted on Sunday and your backup then runs, do you still have a non-compromised backup, and how old is it?
    • How timely can you deal with security fixes, and how will you be alerted that a security fix is available?
    • How do you monitor your services for resource availability, errors in logs, security events?
    • How much downtime is acceptable for routine maintenance, and for incidents?
    • Do you have tooling to ensure you can redeploy the exact same configuration to another host?
    • How do you test upgrades before pushing them to production?

    Not saying this is impossible, but you need to have these questions in mind, the answers written down, and the support infrastructure ready before you start charging people for the service.

    Or you can just provide the service for free, best-effort without guarantees.

    I do both (free services for a few friends, paid by customers at $work, small team). Most of the time it’s smooth sailing, but it needs preparation (and more than one person to handle emergencies - vacations, bus factor and all that).

    For the git service I can recommend gitea + gitea-actions (I run the runners in podman). GitLab has more features but it can be overwhelming if you don’t need them, and it requires more resources.
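    To give an idea of the footprint, the Gitea server itself is a single container. A minimal sketch (paths, ports and image tag are examples, and the actions runner is a separate container):

        # Sketch: Gitea server under podman - paths, ports and tag are examples
        mkdir -p /srv/gitea/data
        podman run -d --name gitea \
            -p 3000:3000 -p 2222:22 \
            -v /srv/gitea/data:/data \
            docker.io/gitea/gitea:latest

    The runner then only needs the instance URL and a registration token from the Gitea admin UI.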

  • Fail2ban is a Free/Open-Source program that parses logs and takes action based on their content. The most common use case is to detect authentication failures in logs and issue a firewall-level ban based on that. It uses regex filters to parse the logs, and policies called jails to determine which action to take (wait for more failures, run command xyz…). It’s old, basic, customizable, and does its job.
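    Day-to-day interaction happens through fail2ban-client; roughly like this (the jail name and IP are just examples):

        # Sketch: poking at fail2ban from the CLI - jail name and IP are examples
        fail2ban-client status                         # list configured jails
        fail2ban-client status sshd                    # failure counters and banned IPs for one jail
        fail2ban-client set sshd banip 203.0.113.7     # ban an IP manually
        fail2ban-client set sshd unbanip 203.0.113.7   # lift the ban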

    Crowdsec is a commercial service [1] with a free offering and some Free/Open-Source components. The architecture is quite different [2]: it connects to CrowdSec’s (the company’s) servers to crowd-source detections, their service establishes a “threat score” for each IP based on the detections they receive, and in exchange they provide [3] some of these threat feeds/blocklists back to their users. A separate crowdsec-bouncer process takes action based on your configuration.
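    For a rough feel of the workflow on the crowdsec side (the collection and bouncer names below are just examples), most of it is driven through cscli:

        # Sketch: typical cscli workflow - collection/bouncer names are examples
        cscli collections install crowdsecurity/sshd   # parsers + scenarios for sshd
        cscli decisions list                           # IPs with an active ban/captcha decision
        cscli bouncers add my-firewall-bouncer         # register the component that enforces decisions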

    If you want to build your own private shared/global blocklist based on crowdsec detections, you’ll need to set up a crowdsec API server and configure all your crowdsec instances to use it. If you want to do this with fail2ban, you’ll need to set up your own sync mechanism (there are multiple options; I use a cron job + script that pulls banned IPs from all fail2ban instances using fail2ban-client status, builds an ipset, and pushes it to all my servers). If you need crowdsourced blocklists, there are multiple free options ([4] can be used directly by ipset).
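    That cron job is roughly equivalent to this sketch (hostnames, jail name and set name are placeholders, not my exact setup):

        #!/bin/sh
        # Sketch: aggregate banned IPs from several fail2ban hosts into one shared ipset.
        HOSTS="host-a host-b host-c"
        JAIL="sshd"

        # 1. collect banned IPs from every fail2ban instance
        BANNED=$(for h in $HOSTS; do
            ssh "$h" "fail2ban-client status $JAIL" |
                sed -n 's/.*Banned IP list:[[:space:]]*//p'
        done | tr ' ' '\n' | sort -u | tr '\n' ' ')

        # 2. push the combined list to every host and make sure the firewall drops it
        for h in $HOSTS; do
            ssh "$h" "
                ipset create -exist shared-banlist hash:ip
                ipset flush shared-banlist
                for ip in $BANNED; do ipset add -exist shared-banlist \$ip; done
                iptables -C INPUT -m set --match-set shared-banlist src -j DROP 2>/dev/null ||
                    iptables -I INPUT -m set --match-set shared-banlist src -j DROP
            "
        done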

    Both can be used for roughly the same purpose, but they are very different in how they work and in the commercial model (or lack thereof) behind the scenes.

  • Data loss is not a problem specific to self-hosting.

    Whenever you administer a system that contains valuable data (a self-hosted network service/application, your personal computer, phone…), think about a backup and recovery strategy for common (and less common) data loss cases:

    1. you delete a valuable file by accident
    2. a bad actor deletes or encrypts the data (ransomware)
    3. the device gets stolen, or destroyed (hardware failure, power surge, fire, flood, hosting provider closing your account)
    4. anything you can think of

    For each of these scenarios, try to find a working backup/restore strategy. For me it goes like this:

    1. Automatic, daily local backups (anything on my server gets backed up once a day to a backups directory using rsnapshot). Note that file sync like Nextcloud won’t protect you against this risk: if you delete a file on the Nextcloud client, it’s also gone on the Nextcloud server (though there is a recycle bin). Local backups are quick and easy to restore after a simple mistake like this. They won’t protect you against 2 and 3.
    2. Assuming an attacker gains access to your machine, they will also destroy or encrypt your local backups. My strategy against this is to pull a copy of the latest local backup, weekly, to a USB drive, through another computer, using rsync/rsnapshot. Then I unplug the USB drive, store it somewhere safe outside my home, and plug in a second USB drive. I rotate the drives every week (or every 2 weeks when I’m lazy - I have set up a notification to nag me to rotate the drive every Saturday, but I sometimes ignore it).
    3. The USB strategy also protects me against 3. If both my server and main computer burn down, the second drive is still out there, safely encrypted. It’s the worst-case scenario; I’d probably spend quite some time setting everything up again (though most of the setup is automated), and at this point I’d have bigger problems like, you know, a burned-down house. But I’d still have my data.

    There are other strategies, tools, etc.; this one works for me. It’s cheap (the USB drives are a one-time investment), and the only manual step is rotating the drives every week or so.
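    Conceptually, the weekly USB step is just something like this (device, paths and hostname are placeholders, and it assumes the drive is LUKS-encrypted):

        # Sketch: pull the newest rsnapshot snapshot onto the offline USB drive
        cryptsetup open /dev/sdX1 usb-backup            # assumes a LUKS-encrypted drive
        mount /dev/mapper/usb-backup /mnt/usb-backup
        # daily.0 is the most recent snapshot in a default rsnapshot setup
        rsync -aH --delete myserver:/var/backups/rsnapshot/daily.0/ /mnt/usb-backup/latest/
        umount /mnt/usb-backup
        cryptsetup close usb-backup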

    • step 1: use named volumes
    • step 2: stop your containers or just wait for them to crash/stop unnoticed for some reason
    • step 3: run docker system prune --all --volumes, as one should do periodically to clean up the garbage Docker leaves on your system. Lose all your data (depending on your Docker version, this deletes even named volumes if they are not in use by a running container)
    • step 4: never use named or anonymous volumes again, use bind mounts

    The fact that you absolutely need to run docker system prune --all regularly to get rid of GBs of unused layers, test containers, etc., combined with the fact that it can delete explicitly named volumes, makes them too unsafe for my taste. Just use bind mounts.
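    With bind mounts the data is just a directory you pick on the host; a sketch (image name and paths are made up):

        # Sketch: bind mount instead of a named volume - image and paths are examples
        mkdir -p /srv/myapp/data
        docker run -d --name myapp \
            -v /srv/myapp/data:/var/lib/myapp \
            example/myapp:latest
        # The data now lives in a plain host directory: it survives docker system prune,
        # gets picked up by normal backups, and can be inspected with ls/du/rsync.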