cross-posted to:
- technology@lemmy.zip
My original, editorialized title: Ars Technica Sells Out
Linking to this because I know people here read Ars Technica, and I totally didn’t become a subscriber three days before this was announced. Nope. No sir.
Damnit! I still like and respect the Ars Technica staff but Condé Nast can piss off.
I feel for you, KingThrillgore. I was thinking of supporting the site with a subscription, but not after this. Still, if enough people stop subscribing we may lose them altogether. This is a double-edged shit sword.
I made it clear in my comment that I was not happy about this after I became a subscriber. It will not auto-renew. If I had done it with a credit card and not PayPal, I’d try for a chargeback.
For what it’s worth, nobody else active on the site is happy about this, either. Lots of unsubscribes are being claimed in the comments (including mine).
Understood and have seen the comments. I don’t blame you for being upset. It’s a crappy situation.
Fuck those shitheads.
What’s the problem here? OpenAI isn’t pirating content if they pay for it. Have I misunderstood something?
ai new
new bad
remember old time
old time good
The main problem now is that Ars Technica and all other Condé Nast publications, with the parent company now having a vested interest in OpenAI (they’re getting paid by them), can no longer be reliably trusted to report on any AI or AI-adjacent topic whatsoever. And every user comment and piece of content is now owned by OpenAI.
I bet MFCBot would work super well if it were updated to report on whether the publication’s owners had a vested interest in the topic.
Oh, it would. It would require the MFC site itself to actually collect and collate data on each site’s parent company’s investment portfolio, and that’s a pretty massive ask for what sounds like a very tiny team.
I wonder if it could somehow pull that data from somewhere like Ground News.
LLMs suck because they steal content and are unreliable since they don’t link back to sources.
OpenAI makes a deal to pay a media org for their content and makes it so they can link back to the original article.
“Ars Technica sold out”
Condé Nast didn’t just sell access to their subsidiaries’ content, but also to the user-generated content on those subsidiaries’ sites. That’s what’s at issue here.
It also creates a potential conflict of interest whenever Ars Technica writes about OpenAI. That’s the second issue here.
And, as per the editor in chief, the money doesn’t go to Ars Technica, but to Condé Nast.
Yes, they sold access to the user content we’ve generated, after we explicitly agreed that they may do so. If you chose not to read the fine print when you created an account and created content for them, that’s sort of on you, tbh.
I am going to stop reading them myself.
It’s not so much ideological as practical: I simply don’t have the time to fact-check them, or to figure out which articles are real and which are AI-generated, etc.
I understand them. If they refused for integrity reasons, OpenAI would steal their content anyway via scrapers.
Suing them for copyright infringement, even if that’s what we all want, is ultra expensive.
I would also have signed that deal with the devil…
Either you sell your soul for something, or get it taken from you, leaving you with nothing.
Never trust Condé Nast to do right by its consumers. That’s a tale decades old at this point.
I want Ars content to be part of whatever training data is provided to the best models. How does that get done without it looking like they’re being bought?
Even if their contract explicitly states that it is a data-sharing agreement only, and that the products of the media organization (articles/investigations) are not grounds for breach or retaliation, it will be assumed that there is now some partiality in future reporting.
So, for all media companies, the options seem to be:
- Contribute to the greater good by openly permitting site scraping (for $0)
- Allow data sharing to contracted parties only (for a fee)
- Publicly or privately prohibit use of any data, then seek damages down the road for theft/copyright infringement once the legal framework has been established.
Is there a GPL-like or other license structure that permits data sharing for LLM training in a way that ensures it doesn’t get transformed into something evil?
This is the logical endpoint for all the people who were complaining that scraping the open web for training is somehow immoral/illegal. Instead of stopping AI, those with deep pockets will continue to train on everything, while open-source and small-company efforts will be locked out.
Useful AI will be focused and narrow unless they actually achieve AGI.
Scraping literally the whole internet for inspiration is part of the reason they come up with utter rubbish. No one’s actually scrutinizing what they’re ingesting. It’s not so much a problem that they violate copyright; it’s more an issue that, because they do it in this manner, their output is garbage.
If these AI companies actually did some content curation, we might get decent AI out of it.