• 0 Posts
  • 43 Comments
Joined 1 year ago
Cake day: June 1st, 2023





  • Never ask a man his pay, a woman her weight/age, or a data hoarder the contents of their stash.

    Jk. Mostly.

    I have a similar-ish setup to @Davel23, with a few cool use cases:

    • I seed the last 5 arch and opensuse (a few different flavors) ISOs at all times

    • I run an ArchiveTeam warrior for archive.org

    • I scan nontrivial mail (the paper kind) and store it in docspell for later OCR searches, tax purposes etc.

    • I help keep Sci-Mag healthy

    • I host several services for de-googling, including Nextcloud, Blocky, Immich, and Searxng

    • I run Navidrome, which has mostly (and hopefully will soon completely) replaced Spotify for my family.

    • I run Plex (hoping to move to Jellyfin sometime, but there’s inertial resistance to that), which has completely replaced Disney streaming, Netflix streaming, etc. for me and my extended family.

    • I host backups for my family and close friends with an S3 and WebDAV backup target

    • I run Frigate on a few PoE cameras in the forest behind my house to check out wildlife

    • I use the audio streams from my cameras to check for birdsong, identify birds, and archive and submit the detections to a citizen science website (https://app.birdweather.com)

    I run 4x14TB, 2x8TB, and 2x4TB drives, all from serverpartsdeals, in a ZFS RAID10 with two 1TB cache drives, so half of the spinning rust is usable, at ~35TiB. Right now I’m at 62% utilization; I usually expand at about 85%.
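
For anyone checking the capacity math, here is a rough sketch in Python. It assumes the drives are mirrored in same-size pairs (two 14TB mirrors, one 8TB mirror, one 4TB mirror), which the comment does not spell out, and it ignores ZFS metadata and reservation overhead, which is probably why the real-world figure lands a bit lower at ~35TiB.

```python
# Rough usable-capacity estimate for a pool of striped mirrors ("RAID10").
# Assumption: drives are paired by size, so each mirror contributes one drive's capacity.
TB = 10**12   # drive vendors use decimal terabytes
TIB = 2**40   # ZFS tools report binary tebibytes

mirror_sizes_tb = [14, 14, 8, 4]  # one entry per two-drive mirror (assumed layout)

usable_bytes = sum(mirror_sizes_tb) * TB
usable_tib = usable_bytes / TIB

print(f"Raw usable capacity: {usable_tib:.1f} TiB (before ZFS overhead)")
```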




  • It was the bad old days of sysadmin, where literally every critical service ran on an iron box in the basement.

    I was on my first oncall rotation. Got my first call from helpdesk: Exchange was down, it was 3AM, and the oncall backup and Exchange SMEs weren’t responding to pages.

    Now I knew Exchange well enough, but I was new to this role and this architecture. I knew the system was clustered, so I quickly pulled the documentation and logged into the cluster manager.

    I reviewed the docs several times. We had Exchange server 1 named something thoughtful like exh-001 and server 2 named exh-002 or something.

    Well, I’d reviewed the docs, and helpdesk and stakeholders were desperate to move forward, so I initiated a failover from clustered mode with 001 as the primary to unclustered mode pointing directly at server 10.x.x.xx2.

    What’s that, you ask? Why did I suddenly switch to the IP address rather than the DNS name? Well, that’s how the servers were registered in the cluster manager. Nothing to worry about.

    Well… Anyone want to guess which DNS name 10.x.x.xx2 was registered to?

    Yeah. Not exh-002. For some crazy legacy reason the DNS names had been remapped in the distant past.

    So anyway, that’s how I made a 15-minute outage into a 5-hour one.

    On the plus side, I learned a lot and didn’t get fired.






  • Yeah, you should be scrubbing weekly or monthly, depending on how often you are using the data. A scrub reads every block of data in the pool, verifies it against its checksum, and proactively repairs any errors it finds using the pool’s redundancy. Basically preventative maintenance.
    https://manpages.ubuntu.com/manpages/jammy/man8/zpool-scrub.8.html

    Set that up in a cron job and check zpool status periodically.
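
A minimal sketch of the "check zpool status periodically" part, in Python so it can be dropped into a cron job or monitoring script. It relies on `zpool status -x`, which prints `all pools are healthy` when nothing is wrong; the function names are made up for illustration, and what you do on failure (email, push notification) is left out.

```python
import subprocess

# Message printed by `zpool status -x` when no pool has problems.
HEALTHY_MESSAGE = "all pools are healthy"


def pools_are_healthy(status_output: str) -> bool:
    """Return True if the output of `zpool status -x` reports no problems."""
    return status_output.strip() == HEALTHY_MESSAGE


def check_pools() -> bool:
    """Run `zpool status -x` and report whether every pool is healthy.

    Requires the ZFS userland tools to be installed and on PATH.
    """
    result = subprocess.run(
        ["zpool", "status", "-x"],
        capture_output=True,
        text=True,
        check=True,
    )
    return pools_are_healthy(result.stdout)
```

Calling `check_pools()` from a scheduled script and alerting when it returns `False` covers the basic case; the scrub itself would still be started separately with `zpool scrub` on your pool.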

    No dedup is good. LZ4 compression is good. RAM to disk ratio is generous.

    Check your disks’ sector size and vdev ashift. Modern multi-TB HDDs generally have a physical sector size of 4K, so you want ashift=12. If this is set improperly, it can lead to massive write amplification, which will hurt throughput.
    https://www.high-availability.com/docs/ZFS-Tuning-Guide/
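
The relationship between sector size and ashift is just a base-2 logarithm: ashift is the exponent, so 2^12 = 4096 bytes. A small Python sketch of that arithmetic (the function name is made up for illustration):

```python
def ashift_for_sector_size(sector_bytes: int) -> int:
    """Return the ashift value whose block size (2**ashift) equals sector_bytes."""
    # Sector sizes are always powers of two; reject anything else.
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a positive power of two")
    return sector_bytes.bit_length() - 1


print(ashift_for_sector_size(4096))  # 4K-sector drive
print(ashift_for_sector_size(512))   # legacy 512-byte-sector drive
```

So a pool created with ashift=9 on 4K-sector drives issues writes smaller than the physical sector, and that mismatch is where the write amplification comes from.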

    How about snapshots? Do you have a bunch of old ones? I highly recommend setting up a snapshot manager to prune snapshots to just a working set (monthly keep 1-2, weekly keep 4, daily keep 6, etc.): https://github.com/jimsalterjrs/sanoid
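
To make the "working set" idea concrete, here is a simplified Python sketch of that kind of retention policy. This is not sanoid's actual algorithm, just an illustration: it keeps the newest snapshot from each of the most recent N days, N ISO weeks, and N months, and the function name and defaults are assumptions taken from the numbers above.

```python
from datetime import datetime


def snapshots_to_keep(snapshots, daily=6, weekly=4, monthly=2):
    """Return the set of snapshot timestamps a simple retention policy keeps.

    For each period type, keep the newest snapshot in each of the most
    recent `limit` periods that contain at least one snapshot.
    """
    newest_first = sorted(snapshots, reverse=True)
    policies = (
        (lambda t: t.date(), daily),                      # one per calendar day
        (lambda t: tuple(t.isocalendar()[:2]), weekly),   # one per ISO (year, week)
        (lambda t: (t.year, t.month), monthly),           # one per calendar month
    )
    keep = set()
    for bucket_of, limit in policies:
        seen = []
        for ts in newest_first:
            bucket = bucket_of(ts)
            if bucket in seen:
                continue  # a newer snapshot already represents this period
            if len(seen) == limit:
                break     # enough periods covered for this policy
            seen.append(bucket)
            keep.add(ts)
    return keep
```

Everything not in the returned set would be a candidate for `zfs destroy`; in practice, letting a tool like sanoid handle this is safer than rolling your own.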

    And to parrot another insightful comment, I also recommend checking the disk health with SMART tests. In ZFS, as a drive begins to fail, the pool will get much slower as it constantly repairs the errors.



  • ZFS is a very robust choice for a NAS. Many people, myself included, as well as hundreds of businesses across the globe, have used ZFS at scale for over a decade.

    Attack the problem. Check your system logs, htop, zpool status.

    When was the last time you ran a zpool scrub? Is there a scrub or other ZFS operation in progress? How many snapshots do you have? How much RAM vs disk space? Are you using ZFS deduplication? Compression?



  • Shinedown The Sound of Madness

    I love all of their albums, but there’s something truly special about the musicality and emotional impact of the songs and album as a whole for The Sound of Madness

    Their subsequent albums are fantastic, but have more ups and downs than the consistent high bar of The Sound of Madness.

    Similarly, I think Disturbed’s Immortalized was their peak work, with The Light and The Sound of Silence as standout examples of their signature style and highly musical storytelling on a fantastic album.