• onlinepersona@programming.dev · 3 months ago

    I see the “just create an account” and “just login” crowd have joined the discussion. Some people will defend a monopolist no matter what. If GitHub introduced ID checks à la Google or required a Microsoft account to log in, they’d just shrug and go “create a Microsoft account then, stop bitching”. They don’t realise they are being boiled and don’t care. Consoomer behaviour.

    Anti Commercial-AI license

    • calcopiritus@lemmy.world · 3 months ago

      Or we just realize that GitHub without logging in is a service we are getting for free. And when there’s something free, there’s someone trying to exploit it. Using GitHub while logged in is also free and has none of these limits, while making it much easier for them to block exploiters.

      • onlinepersona@programming.dev · 3 months ago

        I would like to remind you that you are arguing for a monopolist. I’d agree with you if it were a startup or mid-sized company with lots of competition, providing a good product that was being abused by competitors or users. But GitHub has a quasi-monopoly, is owned by a monopolist that is part of the reason other websites are being bombarded by requests (i.e., they are part of the problem), and you are sitting here arguing that more people should join the monopoly because of an issue they created.

        Can you see the flaws in the reasoning behind your statements?

        Anti Commercial-AI license

        • calcopiritus@lemmy.world · 3 months ago

          No, I cannot find the flaws in my reasoning, because you are not attacking my reasoning. You are saying that I am on the side of the bad people, the bad people are bad, and you are opposed to the bad people, therefore you are right.

          The world is more than black and white. GitHub rate-limiting non-logged-in users makes sense, and is the expected result in the age of web scraping for LLM training.

          Yes, the parent company of GitHub also does web scraping for the purpose of training LLMs. I don’t see what that has to do with defending themselves from other scrapers.

          • onlinepersona@programming.dev · 3 months ago

            Company creates problem. Requires users to change because of created problem. You defend company creating problem.

            That’s the logical flaw.

            If you see no flaws in defending a monopolist, well, you cannot be helped then.

            Anti Commercial-AI license

            • calcopiritus@lemmy.world · 3 months ago

              I don’t think Microsoft invented scraping. Or LLM training.

              Also, GitHub doesn’t have an issue with Microsoft scraping its data. They can just directly access whatever data they want. And rate-limiting non-logged-in users won’t affect Microsoft’s LLM training at all.

              I’m not defending a monopolist because of monopolist actions. First of all, GitHub doesn’t have any kind of monopoly; there are plenty of git forges. And second of all, how does this make their position in the market stronger? If anything, it makes it weaker.

  • varnia@lemm.ee · 3 months ago

    Good thing I moved all my repos from git[lab|hub] to Codeberg recently.

  • Lv_InSaNe_vL@lemmy.world · 3 months ago

    I honestly don’t really see the problem here. This seems to mostly be targeting scrapers.

    For unauthenticated users you are limited to public data only and 60 requests per hour, or 30k if you’re using Git LFS. And for authenticated users it’s 60k/hr.

    What could you possibly be doing besides scraping that would hit those limits?
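
    For anyone wondering where they stand, the REST API reports the current budget at https://api.github.com/rate_limit, and that endpoint doesn’t count against the limit itself. A minimal sketch in Python; the GITHUB_TOKEN environment variable is my assumption, not anything GitHub mandates:

    ```python
    # Minimal sketch: ask GitHub how much of the hourly budget is left.
    # Unauthenticated requests are metered per source IP; sending a token
    # (read here from a hypothetical GITHUB_TOKEN env var) switches you
    # to the per-account budget instead.
    import json
    import os
    import urllib.request

    req = urllib.request.Request("https://api.github.com/rate_limit")
    req.add_header("User-Agent", "rate-limit-check")  # GitHub rejects UA-less requests
    token = os.environ.get("GITHUB_TOKEN")
    if token:
        req.add_header("Authorization", f"Bearer {token}")

    with urllib.request.urlopen(req) as resp:
        core = json.load(resp)["resources"]["core"]

    print(f"{core['remaining']} of {core['limit']} requests left this window")
    ```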

    • chaospatterns@lemmy.world (OP) · 3 months ago

      You might be behind a shared IP with NAT or CG-NAT that shares that limit with others, or might be fetching files from raw.githubusercontent.com as part of an update system that doesn’t have access to browser credentials, or Git cloning over https:// to avoid having to unlock your SSH key every time, or cloning a Git repo with submodules that separately issue requests. An hour is a long time. Imagine if you let uBlock Origin update its filter lists, then you git clone something with a few submodules, and so does your coworker, and now you’re blocked for an entire hour.

    • MangoPenguin@lemmy.blahaj.zone · 3 months ago

      60 requests per hour per IP could easily be hit by, say, uBlock Origin updating filter lists in a household with 5-10 devices.
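
      Rough arithmetic makes the point; these counts are assumptions, not measurements of what uBlock Origin actually fetches:

      ```python
      # Back-of-the-envelope: how fast a household burns a shared 60/hour budget.
      devices = 8            # phones/laptops behind one router (one public IP)
      lists_per_device = 10  # filter lists fetched from raw.githubusercontent.com

      total = devices * lists_per_device
      print(total, "requests from one IP")  # 80 > 60, so the IP gets rate-limited
      ```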

    • Disregard3145@lemmy.world · 3 months ago

      I hit those limits many times when signed out, just scrolling through the code. The front end must be sending off tonnes of background requests.

  • daniskarma@lemmy.dbzer0.com · 3 months ago

    Open source repositories should rely on p2p. Torrenting repos is the way, I think.

    Not only for this. At any point m$ could take down your repo if they or their investors don’t like it.

    I wonder if something like that already exists, and whether it could work with git?
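
    For what it’s worth, git already has a primitive that would fit this model: git bundle packs a repository into a single file, which could then be seeded like any other torrent payload. A rough sketch with hypothetical paths; incremental updates would still need fresh bundles (e.g. bundling only a ref range):

    ```python
    # Sketch: pack a repo into one self-contained file that a torrent
    # client could seed. All paths here are hypothetical.
    import subprocess

    # Pack every branch and tag into a single bundle file.
    subprocess.run(
        ["git", "bundle", "create", "/tmp/repo.bundle", "--all"],
        cwd="/path/to/repo",
        check=True,
    )

    # On the receiving side, git clones straight from the bundle file:
    subprocess.run(
        ["git", "clone", "/tmp/repo.bundle", "/tmp/repo"],
        check=True,
    )
    ```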

    • Kuinox@lemmy.world · 3 months ago

      Torrenting doesn’t deal well with updating files.
      And you have another problem: how do you handle bad actors spamming the download?
      That’s probably why GitHub does that.

      • daniskarma@lemmy.dbzer0.com · 3 months ago

        That’s true. I didn’t think of that.

        IPFS supposedly works fine with updating shares. But I don’t want to get closer to that project, as they have fallen into cryptoscam territory.

        I’m currently reading about Radicle; let’s see what they propose.

        I don’t get the part about bad actors spamming the download. Like downloading too much? Torrent leechers?

        EDIT: Just finished my search about Radicle. They of course have relations with a cryptoscam. Obviously… ;_; Why does this keep happening?

        • Jakeroxs@sh.itjust.works · 3 months ago

          There’s literally nothing about crypto in Radicle from my reading; cryptography and cryptocurrency are not synonymous.

          Ah, because they also have a separate project: a crypto payment platform for funding open source development.

          Edit again: it seems pretty nifty actually. Why do you think it’s a scam? Just because of the crypto?

  • John Richard@lemmy.world · 3 months ago

    Crazy how many people think this is okay, yet left Reddit because of their API shenanigans. GitHub is already halfway to requiring signing in to view anything, like Twitter (X).

  • tal@lemmy.today · 3 months ago

    60 req/hour for unauthenticated users

    That’s low enough that it may cause problems for a lot of infrastructure. Like, I’m pretty sure that the MELPA Emacs package repository builds out of git, and a lot of that is on GitHub.

    • NotSteve_@lemmy.ca · 3 months ago

      Do you think any infrastructure is pulling that often while unauthenticated? It seems like an easy fix either way (in my admittedly non-devops opinion).

      • Ephera@lemmy.ml · 3 months ago

        It’s gonna be problematic in particular for organisations with larger offices. If you’ve got hundreds of devs/sysadmins under the same public IP address, those 60 requests/hour are shared between them.

        Basically, I expect unauthenticated pulls to no longer be possible at my day job, which means repos hosted on GitHub become a pain.

        • timbuck2themoon@sh.itjust.works · 3 months ago

          Quite frankly, companies shouldn’t be pulling willy-nilly from GitHub or npm etc. anyway. It’s trivial to set up something to cache repos or artifacts. Plus it guards against being down when GitHub is down.
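
          For git specifically, one low-friction option is URL rewriting in each dev’s ~/.gitconfig, so clones of GitHub repos transparently go through the mirror; the mirror hostname below is made up:

          ```ini
          # ~/.gitconfig: rewrite GitHub URLs to an internal caching mirror
          [url "https://git-mirror.internal.example/github.com/"]
              insteadOf = https://github.com/
          ```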

          • Ephera@lemmy.ml · 3 months ago

            It’s easy to set up a cache, but what’s hard is convincing your devs to use it.

            Mainly because, well, it generally works without configuring the cache in your build pipeline, as you’ll almost always need some solution for accessing the internet anyways.

            But there are other reasons, too. You need authentication or a VPN to access a cache like that. Authentication means you have to deal with credentials, which is a pain. A VPN means it’s likely slower than downloading directly from the internet, at least while you’re working from home.

            Well, and it’s also just yet another moving part in your build pipeline. If that cache is ever down or broken or inaccessible from certain build infrastructure, chances are it will get removed from affected build pipelines and those devs are unlikely to come back.


            Having said that, GitHub is of course promoting caches quite heavily here. That might actually make them worth using for individual devs.
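
            For scale, the per-dev configuration for such a cache is usually tiny; for Python packages, say, it’s a couple of lines of pip config, with a made-up mirror URL here:

            ```ini
            # pip.conf: point pip at an internal PyPI mirror/cache
            [global]
            index-url = https://pypi-mirror.internal.example/simple
            ```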