I realise this is a known issue and that lemmy.world isn’t the only instance that does this. Also, I’m aware that there are other things affecting federation. But I’m seeing some things not federate, and can’t help thinking that things would be going smoother if the output from the biggest Lemmy instance wasn’t 50% spam.

Hopefully this doesn’t seem like I’m shit-stirring, or trying to make the Issue I’m interested in more important than other Issues. It’s something I mention occasionally, but it might be a bit abstract if you’re not the admin of another instance.

The red terminal is a tail -f of the nginx log on my server. The green terminal is outputting some details from the ActivityPub JSON containing the Announce. You should be able to see the correlation between the lines in the nginx log, and lines from the activity, and that everything is duplicated.

This was generated by me commenting on an old post, using content that spawns an answer from a couple of bots, and then me upvoting the response. So CREATE, CREATE, LIKE is being announced as CREATE, CREATE, CREATE, CREATE, LIKE, LIKE. If you scale that up to every activity by every user, you’ll appreciate that LW is creating a lot of work for everyone else in the Fediverse, just to filter out the duplicates.
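As a rough illustration of the filtering work a receiving server has to do, here is a minimal sketch. It is not how Lemmy or any other server actually does it. It assumes duplicates can be recognised by the `object.id` of the announced activity (the field shown in the green terminal); the class name `SeenActivities`, the capacity of 10,000 and the example IDs are all made up.

```python
from collections import OrderedDict


class SeenActivities:
    """Remember recently seen activity IDs so repeats can be dropped.

    Hypothetical helper for illustration; not taken from any real server.
    """

    def __init__(self, capacity=10_000):
        self.capacity = capacity
        self._seen = OrderedDict()

    @staticmethod
    def _key(announce):
        # The announced activity may be embedded or referenced by a bare ID.
        obj = announce.get("object")
        if isinstance(obj, dict):
            return obj.get("id")
        return obj

    def is_duplicate(self, announce):
        key = self._key(announce)
        if key is None:
            return False  # nothing to compare against, so let it through
        if key in self._seen:
            self._seen.move_to_end(key)
            return True
        self._seen[key] = None
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)  # evict the oldest ID
        return False


# Made-up IDs standing in for CREATE, CREATE, LIKE, each announced twice.
ids = ["create/1", "create/1", "create/2", "create/2", "like/1", "like/1"]
incoming = [{"type": "Announce", "object": {"id": i}} for i in ids]

seen = SeenActivities()
kept = [a["object"]["id"] for a in incoming if not seen.is_duplicate(a)]
print(len(incoming), "received,", len(kept), "kept")
```

Even with a cheap in-memory check like this, the receiver still has to accept the connection and parse every duplicate body before it can throw it away.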

  • bamboo@lemmy.blahaj.zone
    3 months ago

    Are you able to include the HTTP method being called and the amount of data transferred per request? It’s possible that the first request is an OPTIONS request and then the second request is a POST.

    If you can see the amount of data transferred, then you have some more indication that double the requests are being sent, and you can at least quantify the bandwidth impact.

    • freamon@lemmy.worldOP
      3 months ago

      They’re all POST requests. I trimmed it out of the log for space, but the first 6 requests on the video looked like this (the size nginx logs by default is the response body, not the request body, so it shows the data amount for a GET but just 0 for these POSTs):

      ip.address - - [07/Apr/2024:23:18:44 +0000] "POST /inbox HTTP/1.1" 200 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
      ip.address - - [07/Apr/2024:23:18:44 +0000] "POST /inbox HTTP/1.1" 200 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
      ip.address - - [07/Apr/2024:23:19:14 +0000] "POST /inbox HTTP/1.1" 200 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
      ip.address - - [07/Apr/2024:23:19:14 +0000] "POST /inbox HTTP/1.1" 200 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
      ip.address - - [07/Apr/2024:23:19:44 +0000] "POST /inbox HTTP/1.1" 200 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
      ip.address - - [07/Apr/2024:23:19:44 +0000] "POST /inbox HTTP/1.1" 200 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
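To check for this pattern across a whole access log rather than by eye, something like the following sketch could be used. It assumes nginx's default "combined" log format; the function name `inbox_posts_per_second` is made up, and the sample lines simply rebuild the excerpt above.

```python
import re
from collections import Counter

# Matches nginx's default "combined" access-log format.
COMBINED = re.compile(
    r'^(?P<addr>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def inbox_posts_per_second(lines):
    """Count POST /inbox requests for each (timestamp, user agent) pair."""
    counts = Counter()
    for line in lines:
        match = COMBINED.match(line.strip())
        if match and match["method"] == "POST" and match["path"] == "/inbox":
            counts[(match["time"], match["agent"])] += 1
    return counts


# Timestamps and user agent copied from the log excerpt above.
AGENT = "Lemmy/0.19.3; +https://lemmy.world"
sample = [
    f'ip.address - - [07/Apr/2024:23:{t} +0000] '
    f'"POST /inbox HTTP/1.1" 200 0 "-" "{AGENT}"'
    for t in ("18:44", "18:44", "19:14", "19:14", "19:44", "19:44")
]

for (time, agent), count in inbox_posts_per_second(sample).items():
    print(time, agent, count)
```

Two requests in the same second are only a hint, not proof, of duplication; comparing the request bodies is what confirms it.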
      

      If I was running Lemmy, every second line would say 400, from it rejecting it as a duplicate. In terms of bandwidth, every line represents a full JSON document, so I guess it’s about 2K minimum for the standard cruft, plus however much for the actual contents of the comment (the comment replying to this would’ve been 8K).

      My server just took the requests and dumped the bodies out to a file, and then a script was outputting the object.id, object.type and object.actor into /tmp/demo.txt (which is another confirmation that they were POST requests, of course)
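For anyone wanting to reproduce this, the extraction step might look roughly like the sketch below. It is not the script used above: the function name `summarise` and the sample body (with its example.com URLs) are invented, and it assumes the Announce embeds the announced activity under `object`.

```python
import json


def summarise(body):
    """Return 'id type actor' for the activity wrapped in an Announce."""
    announce = json.loads(body)
    inner = announce.get("object", {})
    if not isinstance(inner, dict):
        # The object can also be a bare URL rather than an embedded activity.
        return str(inner)
    fields = (inner.get("id"), inner.get("type"), inner.get("actor"))
    return " ".join(str(f) for f in fields)


# Invented request body, trimmed to the fields this sketch reads.
sample_body = json.dumps({
    "type": "Announce",
    "actor": "https://example.com/c/demo",
    "object": {
        "id": "https://example.com/activities/create/1",
        "type": "Create",
        "actor": "https://example.com/u/alice",
    },
})

print(summarise(sample_body))
```

Appending each summary line to a file and running tail -f on it gives the kind of view shown in the green terminal.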

  • tedu@azorius.net
    3 months ago

    I see the most duplicated activities from programming.dev and mander.xyz, but it happens a lot.

    • freamon@lemmy.worldOP
      3 months ago

      I’ve been coerced into reporting it as a bug in Lemmy itself - perhaps you could add your own observations here so I seem like less of a crank. Thanks.

  • freamon@lemmy.worldOP
    3 months ago

    Update: for LW, this behaviour stopped around about Friday 12th April. Not sure what changed, but at least the biggest instance isn’t doing it anymore.

    • freamon@lemmy.worldOP
      3 months ago

      We were typing at the same time, it seems. I’ve included more info in a comment above, showing that they were POST requests.

      Also, the green terminal is outputting part of the body for each request, to demonstrate. If they weren’t POST requests to /inbox, my server wouldn’t have even picked them up.

      EDIT: by ‘server’ I mean the back-end one, the one nginx is reverse-proxying to.

  • bob@lemmy.ml
    3 months ago

    I’m curious why there isn’t a way (as far as I’m aware at the moment) to prohibit responding to a post from 3+ years ago.

    • XNX@slrpnk.net
      3 months ago

      Why would there be? Old threads can still be very useful after years, and discussion can continue, especially with software-related threads. I find lots of bug fixes on Reddit in threads from years ago that get updated by a single person posting years later with a fix, on the subreddits that don’t annoyingly auto-archive “old” posts.

    • Die4Ever@programming.dev
      3 months ago

      Being able to respond to old posts is a good thing, like classic forums. I always hated that Reddit didn’t allow you to do that, and Reddit also didn’t have sort options for New Comments or Active.

      Imagine if someone made a post about a tech issue, it ranked high on Google results, lots of people in the comments with the same issue, and you found the solution, but the post was too old to reply to.