Hi all,

I’m having an issue with an NFS mount that I use for serving podcasts through Audiobookshelf. The issue has been ongoing for months, and I’m not sure where the problem is or how to start debugging.

My setup:

  • Unraid with an NFS share “podcasts” set up
  • Proxmox on another machine, with a VM running Fedora Server 40
  • Storage configured in Fedora to mount the “podcasts” share on boot; works fine
  • A Docker container on the same Fedora VM runs Audiobookshelf, with the “podcasts” mount passed through in the docker-compose file

The issue:

The NFS mount randomly drops. When it does, I need to manually remount it and then restart the Audiobookshelf container (or reboot the VM, but that disrupts my other services).

There doesn’t seem to be any rhyme or reason to the unmount. It doesn’t coincide with any scheduled updates or spikes in activity, and there’s no issue on the Unraid side that I can see. Sometimes it drops overnight, sometimes midday. Sometimes it’s fine for a week; other times I’m remounting twice a day. What finally forced me to seek help: the other day I was listening to a podcast, paused for 10–15 minutes, and couldn’t resume the episode until I went through the manual mount procedure. I checked, and it was not due to the disk spinning down.

I’ve tried updating everything I could, but the issue persists. I only just updated to Fedora 40. It was on 38 previously and initially worked for many months without issue, then randomly started dropping the NFS mounts (I tried setting up other share mounts and hit the same problem). I updated to 39, then 40, and the issue persists.

I’m not great with logs, but I’m trying to learn. Nothing sticks out so far.

Does anyone have any ideas how I can debug and hopefully fix this?

  • SpeakinTelnet@programming.dev · 15 days ago (edited)

    First thing I’d do is look at the client (Fedora) journal for anything funky happening.

    ‘sudo systemctl status nfs-client.target’
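
    If that status check looks clean, the journal itself can be filtered for NFS/RPC messages. A sketch of what I'd run (unit and keyword names may vary by setup):

    ```shell
    # Kernel messages mentioning NFS/RPC problems over the last two days
    sudo journalctl -k --since "-2 days" | grep -iE 'nfs|rpc|not responding|timed out'

    # Client-side NFS unit logs over the same window
    sudo journalctl -u nfs-client.target --since "-2 days"
    ```

    A “server not responding, timed out” line in the kernel log right before a drop would point at the network or the server rather than the client config.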

    Since it’s random, I assume you won’t have any timeout in your /etc/fstab, but it might be worth taking a look anyway.
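
    For reference, a hardened entry might look something like this. It’s a sketch: the server address, export path, and mount point are placeholders, and the timeout values are examples, not recommendations:

    ```
    # /etc/fstab — hypothetical NFS entry; adjust server, export, and mount point
    192.168.1.10:/mnt/user/podcasts  /mnt/podcasts  nfs  rw,hard,timeo=150,retrans=3,_netdev,x-systemd.automount  0  0
    ```

    The x-systemd.automount option makes systemd remount the share on next access instead of leaving a dead mount behind, which can paper over exactly this kind of drop.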

    Be aware that if the network drops, the NFS mount will be disconnected and won’t auto-reconnect, so this could also be the issue.

    I don’t know how well it plays with container-mounted volumes, but looking at autofs could be a solution to auto-remount the share. I use it extensively for network-mounted home directories.
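
    For what it’s worth, a minimal autofs setup for a share like this might look roughly like the following. All paths and the server address are hypothetical:

    ```
    # /etc/auto.master.d/podcasts.autofs — hypothetical master map entry
    /mnt/auto  /etc/auto.podcasts  --timeout=60

    # /etc/auto.podcasts — hypothetical map: mounts on access, unmounts when idle
    podcasts  -fstype=nfs4,rw  192.168.1.10:/mnt/user/podcasts
    ```

    With that in place, touching /mnt/auto/podcasts triggers the mount automatically; whether a Docker bind mount cooperates depends on mount propagation, as noted above.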

  • schizo@forum.uncomfortable.business · 15 days ago

    I’m going to have to cut up my nerd card here, but I had similar issues with NFS exports from my roll-your-own build.

    After a month of troubleshooting I decided that working is better than purity so I just mounted the SMB shares instead and everything just worked going forward.

    Best I can tell, NFS is just very, very finicky when it comes to hardware availability (drive spun down, etc.) and network reliability, and is a lot less robust than other options. I never was able to trace why NFS was the one and only thing that never seemed to work right, but at least there are other options as a workaround?
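
    For anyone wanting to try the same workaround, a CIFS mount of the Unraid share might look roughly like this. Server, share name, credentials file, and IDs are all placeholders:

    ```
    # /etc/fstab — hypothetical SMB mount of the same share
    //192.168.1.10/podcasts  /mnt/podcasts  cifs  rw,credentials=/etc/samba/creds-unraid,uid=1000,gid=1000,_netdev  0  0
    ```

    The credentials file just holds username= and password= lines so they stay out of fstab.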

    • Joe@discuss.tchncs.de · 15 days ago

      NFSv3 (UDP, stateless) was always as reliable as the network infra under Linux, I found. NFSv4 made things a bit more complicated.

      You don’t want any NAT / stateful connection tracking in the network path (anything that could hiccup and forget), and wired connections only for permanent storage mounts, of course.
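
      If you want to test whether v3 behaves better for you, the protocol version can be pinned in the mount options. A sketch, with server and paths as placeholders (TCP shown, since UDP is often disabled on modern kernels):

      ```
      # /etc/fstab — hypothetical entry pinning NFSv3 over TCP for comparison
      192.168.1.10:/mnt/user/podcasts  /mnt/podcasts  nfs  rw,nfsvers=3,proto=tcp,hard,_netdev  0  0
      ```

      ‘mount | grep podcasts’ afterwards shows which version actually got negotiated.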

      • schizo@forum.uncomfortable.business · 14 days ago

        Yeah, it was NAS -> DAC -> switch -> endpoints, and for whatever reason, for some use cases, it would just randomly hiccup and break shit.

        I could never figure out what the problem was, and as far as I could tell, nothing in the network path stopped working or flapped — unless it happened so fast it didn’t trigger any monitoring — yet somehow it still broke NFS (and only NFS).

        Figured after a bit that since everything else seemed fine, and the data was being exported via like 6 other methods, that meh, I’ll just use something else.

    • phanto@lemmy.ca · 15 days ago

      I did the hackiest, lamest thing back in the day… I had my client write the current date and time to a file on the share every two minutes as a cron job… Kept it working for months! I saw it on a forum somewhere, tried it, and… Shocked Pikachu face, I don’t know if I ever disabled that cron job! Haha!
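
      The keepalive described above boils down to a one-line crontab entry, something like this (the mount path is a placeholder):

      ```
      # crontab -e — hypothetical keepalive: overwrite a timestamp file every 2 minutes
      */2 * * * * date > /mnt/podcasts/.keepalive
      ```

      Because ‘>’ truncates rather than appends, the file never grows — it just keeps the share from ever going idle.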

        • phanto@lemmy.ca · 14 days ago

          I checked, and it’s still there! (It doesn’t append, it overwrites, so no, I just have a file with the current date and time, accurate to within two minutes.)

  • 2xsaiko@discuss.tchncs.de · 15 days ago

    Never seen this before, but you can enable NFS debugging with ‘rpcdebug -m nfs -s all’ (or ‘-m nfsd’ on the server, or ‘-m rpc’ for the underlying protocol). It prints to dmesg.
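
    To round that out: the same tool clears the flags again with ‘-c’ instead of ‘-s’, and ‘dmesg -w’ follows the output live:

    ```
    # Watch the debug output as it arrives
    sudo dmesg -w

    # Clear the debug flags again when finished (they are noisy)
    sudo rpcdebug -m nfs -c all
    ```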