We have paused all crawling as of Feb 6th, 2025 until we implement robots.txt support. Stats will not update during this period.

  • Boomer Humor Doomergod@lemmy.world · 3 upvotes · 3 months ago

    Robots.txt is a lot like email in that it was built for a far simpler time.

    It would be better if the server could detect bots and send them down a rabbit hole rather than trusting randos to abide by the rules.

  • corsicanguppy@lemmy.ca · 2 upvotes · 3 months ago

    stoped

    Well, they needed to stope. Stope, I said. Lest thy carriage spede into the crosseth-rhodes.

    • mesa@lemmy.world (OP) · 2 upvotes · 3 months ago

      No idea, honestly. If anyone knows, let us know! I don't think it's necessarily a bad thing. If their crawler was being too aggressive, it could accidentally DDoS smaller servers. I'm hoping that is what they are doing, and that they are respecting the robots.txt that some sites have.

      • Ada@lemmy.blahaj.zone · 3 upvotes · 3 months ago

        GoToSocial has a setting in development that is designed to baffle bots that don't respect robots.txt. FediDB didn't know about that feature and thought GoToSocial was trying to inflate its stats.

        In the arguments that went back and forth between the devs of the apps involved, it turned out that FediDB was ignoring robots.txt, i.e., it was badly behaved.
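
        For anyone wondering what "respecting robots.txt" looks like on the crawler side, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not FediDB's or GoToSocial's code; the robots.txt body, the `ExampleStatsBot` user agent, and the `example.social` URLs are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body. A real crawler would download this from
# the site's /robots.txt before requesting anything else.
ROBOTS_TXT = """\
User-agent: *
Disallow: /api/
Crawl-delay: 10
"""

AGENT = "ExampleStatsBot"  # hypothetical crawler name


def make_parser(robots_body: str) -> RobotFileParser:
    """Build a parser from an already-fetched robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, url: str) -> bool:
    """Return True only if the rules allow AGENT to request this URL."""
    return parser.can_fetch(AGENT, url)


parser = make_parser(ROBOTS_TXT)

# A well-behaved crawler checks every URL before requesting it.
allowed_about = may_fetch(parser, "https://example.social/about")
allowed_api = may_fetch(parser, "https://example.social/api/v1/instance")

# Crawl-delay is a non-standard directive, but honoring it is what keeps
# a crawler from hammering small servers. None means no delay was given.
delay_seconds = parser.crawl_delay(AGENT)
```

        With these made-up rules, the `/about` page is allowed, anything under `/api/` is off limits, and the crawler is asked to wait between requests. None of this is enforced by the server, which is why a crawler that skips the check can end up in a trap meant for misbehaving bots.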