cross-posted to:
- technology@lemmy.zip
My original, editorialized title: Ars Technica Sells Out
Linking to this because I know people here read Ars Technica, and I totally didn’t become a subscriber three days before this was announced. Nope. No sir.
Damnit! I still like and respect the Ars Technica staff but Condé Nast can piss off.
I feel for you, KingThrillgore. I was thinking of supporting the site with a subscription, but not after this. Still, if enough people stop subscribing we may lose them altogether. This is a double-edged shit sword.
I made it clear in my comment that I was not happy about this after I became a subscriber. It will not auto-renew. If I had done it with a credit card and not PayPal, I’d try for a chargeback.
For what it’s worth, nobody else active on the site is happy about this, either. Lots of unsubscribes are being claimed in the comments (including mine).
Understood and have seen the comments. I don’t blame you for being upset. It’s a crappy situation.
Fuck those shitheads.
What’s the problem here? OpenAI isn’t pirating content if they pay for it. Have I misunderstood something?
ai new
new bad
remember old time
old time good
The main problem now is that Ars Technica and all other Condé Nast publications, with the parent company now having a vested interest in OpenAI (they’re getting paid by them), can no longer be reliably trusted to report on any AI or AI-adjacent topic whatsoever. And every user comment and piece of content is now owned by OpenAI.
I bet MFCBot would work super well if it were updated to report on whether the publication’s owners had a vested interest in the topic.
Oh, it would. It would require the MFC site itself to actually collect and collate data on each site’s parent company’s investment portfolio, and that’s a pretty massive ask for what sounds like a very tiny team.
I wonder if it could somehow pull that data from somewhere like Ground News.
LLMs suck because they steal content and are unreliable since they don’t link back to sources.
OpenAI makes a deal to pay a media org for their content and makes it so they can link back to the original article.
“Ars Technica sold out”
Condé Nast didn’t just sell access to their subsidiaries’ content, but also to the user-generated content on those subsidiaries’ sites. That’s what’s at issue here.
It also creates a potential conflict of interest whenever Ars Technica writes about OpenAI. That’s the second issue here.
And, as per the editor in chief, the money doesn’t go to Ars Technica, but to Condé Nast.
Yes, they sold access to the user content we’ve generated, after we explicitly agreed that they may do so. If you chose not to read the fine print when you created an account and created content for them, that’s sort of on you, tbh.
I am going to stop reading them myself.
It’s not so much ideological as practical: I simply don’t have the time to fact-check them, or to figure out which articles are real and which are AI-generated, etc.
I understand them. If they refused for integrity reasons, OpenAI would steal their content anyway via scrapers.
Suing them for copyright infringement, even if that’s what we all want, is ultra expensive.
I would also have signed that deal with the devil…
Either you sell your soul for something, or get it taken from you, leaving you with nothing.
Never trust Condé Nast to do right by its consumers. That’s a tale decades old at this point.
I want Ars content to be part of whatever training data is provided to the best models. How does that get done without it looking like they’re being bought?
Even if their contract explicitly states that it is a data-sharing agreement only, and that the products of the media organization (articles/investigations) are not grounds for breach or retaliation, it will be assumed that there is now some partiality in future reporting.
So, for all media companies, the options seem to be:
- Contribute to the greater good by openly permitting site scraping (for $0)
- Allow data sharing to contracted parties only (for a fee)
- Publicly or privately prohibit use of any data, then seek damages down the road for theft/copyright infringement once the legal framework has been established.
Is there a GPL-like or other license structure that permits data sharing for LLM training in a way that ensures it doesn’t get transformed into something evil?
This is the logical endpoint for all the people who were complaining that scraping the open web for training is somehow immoral/illegal. Instead of stopping AI, those with deep pockets will continue to train on everything, while open-source and small-company efforts will be locked out.
Useful AI will be focused and narrow unless they actually achieve AGI.
Scraping literally the whole internet for inspiration is part of the reason they come up with utter rubbish. No one’s actually scrutinizing what they’re ingesting. It’s not so much a problem that they violate copyright; it’s more an issue that, because they do it in this manner, their output is garbage.
If these AI companies actually did some content curation, we might get decent AI out of it.