Maven, a new social network backed by OpenAI’s Sam Altman, found itself in a controversy today when it imported a huge number of posts and profiles from the Fediverse and then ran AI analysis to alter the content.

  • misk@sopuli.xyz
    5 months ago

    That’s why I keep saying it’s pointless to defederate corpos. They’ll just scrape everything before you notice.

    • Blaze@reddthat.com
      5 months ago

      Defederation is more about not being flooded with 1000x more users than the Fediverse currently has

      • IronKrill@lemmy.ca
        5 months ago

        Unfortunately a lot of people think it’s to do with scraping as well. The amount of “defederate Threads so that they can’t scrape my data” posts I saw was about 50-50 with the sensible takes.

      • misk@sopuli.xyz
        5 months ago

        So far we only have a corpo fedi-twitter in the form of Threads. In that case a user on a non-corpo instance has to specifically follow someone before their content is federated, so that sounds like a bit of an overblown issue.

        • Blaze@reddthat.com
          5 months ago

          Seems pretty easy for any corporation to set up something like https://lemmy-federate.com/ but for Mastodon/IceShrimp/Misskey accounts, to federate the important corporate accounts to the targeted non-corpo instances

          • misk@sopuli.xyz
            5 months ago

            There’s no real harm in that unless they spam, at which point those accounts can be banned, which shouldn’t overwhelm moderators.

  • verstra@programming.dev
    5 months ago

    Oh shit, the persona guy was right! We should all be adding a license to our comments, so they could not legally train models that are then used for commercial purposes.

      • onlinepersona@programming.dev
        5 months ago

        Why do you think it won’t hold water legally? There’s a case going right now against GitHub Copilot for scraping GPL-licensed code, even spitting it back out verbatim, and not making “open” AI actually open.

        Creative Commons is not a joke licence. It actually is used by artists, authors, and other creative types.

        Imagine Maven or another company doing the same shit they just did and it coming to light that there was a bunch of noncommercially licensed content in there. The authors could band together for a class action lawsuit and sue their asses. Given the reaction of users here and on Mastodon, I wouldn’t even be surprised if it did happen.

        Anti Commercial-AI license

          • Venia Silente@lemm.ee
            5 months ago

            Don’t we also need a critical mass of people adding licenses to posts, so that a class action suit can be launched? Because it would be unviable and a very rapid path to self-defeat if people started to try and individually sue big corpo.

            Also I’m missing a way to automatically add this to my posts. Something like a browser extension.

            This post is licensed under CC BY-NC-SA 4.0.

              • Venia Silente@lemm.ee
                5 months ago

                Also for me I’m using a text expander so that after I type a shortcut it automatically adds the rest of the text for me.

                I request of you, show me your ways!

                • Danterious@lemmy.dbzer0.com
                  5 months ago

                  Well on firefox/chrome extensions you can search for text expander and choose an extension that works for you.

                  Or if you are using a phone you can do the same on the app store and I think there should be a few options.

                  Once you download one of them it should give instructions on how to use it, but in general it asks you to create a longer phrase that you want inserted automatically and a shorter phrase that is automatically replaced with the longer phrase.

                  For example-

                  long phrase: The quick brown fox jumped over the moon.

                  short phrase: /qfox

                  and every time you typed /qfox it would replace it with “The quick brown fox jumped over the moon.”

                  Anti Commercial-AI license (CC BY-NC-SA 4.0)
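
For readers who want to see the mechanism rather than install an extension, here is a minimal sketch of what a text expander does, assuming a simple lookup table of shortcuts. The `EXPANSIONS` table, the `/lic` shortcut, and the `expand` function are hypothetical names made up for illustration; real extensions hook into the browser's text fields and will differ in the details.

```python
import re

# Hypothetical shortcut table: short phrase -> long phrase.
EXPANSIONS = {
    "/qfox": "The quick brown fox jumped over the moon.",
    "/lic": "Anti Commercial-AI license (CC BY-NC-SA 4.0)",
}


def expand(text, expansions=EXPANSIONS):
    """Replace each standalone shortcut in text with its long phrase."""
    if not expansions:
        return text
    # Try longer shortcuts first so one shortcut cannot shadow another
    # that merely starts with the same characters.
    triggers = sorted(expansions, key=len, reverse=True)
    pattern = re.compile(
        # A shortcut must start at the beginning of the text or after
        # whitespace, and must not run straight into a word character.
        r"(?<!\S)(" + "|".join(re.escape(t) for t in triggers) + r")(?!\w)"
    )
    return pattern.sub(lambda m: expansions[m.group(1)], text)


if __name__ == "__main__":
    print(expand("/qfox"))
    print(expand("Nice post.\n\n/lic"))
```

Matching only standalone shortcuts is a design choice in this sketch: text such as `/qfoxes` or a URL path containing `/lic` is left alone instead of being expanded by accident.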

    • onlinepersona@programming.dev
      5 months ago

      It’s especially for these kinds of dumb cases where they simply copy content wholesale and boast about it. With more people licensing their content as noncommercial, the “hot water” these companies get into could be not just trivial but actually legal.

      Would be great if web and mobile clients supported signatures or a “licence” field from which signatures were generated. Even better would be if people smarter than me added a feature to poison AI training data. This could also be done by a signature or some other method.

      Anti Commercial-AI license

      • TheGalacticVoid@lemm.ee
        5 months ago

        I don’t know; AFAIK, Reddit successfully argued that they own Wallstreetbets’ trademarks in court. That might void all of these licenses depending on the ToS of the instance being used.

  • Grandwolf319@sh.itjust.works
    5 months ago

    Genuine question, do instances not have a GPL license on their content? With that license, anyone can use all the data but only for open source software.

    • GamingChairModel@lemmy.world
      5 months ago

      Instances don’t actually own the copyright to comments. The poster owns the copyright and licenses it to the instance. Which lets the instance use it, but not sublicense to others.

    • jackalope@lemmy.ml
      5 months ago

      I don’t think you can use the GPL for anything but code. A Creative Commons license would be more appropriate.

  • Flax@feddit.uk
    5 months ago

    Does Maven have anything to do with AI, apart from being backed by a dude who works for OpenAI?

    • Sean Tilley@lemmy.world (OP)
      5 months ago

      Yes, the entire platform trains itself on posts within its platform to make algorithmic decisions and present it to users. Instead of likes or follows, you just have that.

      • Flax@feddit.uk
        5 months ago

        But it doesn’t actually produce content that’s AI generated by an LLM model?