Bite each other's dicks off
Note that Disney and Universal pirate other people's stuff whenever they want.
Note also that all the Generative AI services are very protective of their big cistern of web-crawled data, say when China borrows it for DeepSeek.
Content, content everywhere and not a drop of principle.
A copy is not theft.
Intellectual property is thought monopoly. See Disco Elysium for a particularly sad case of it.
Do you mean play Disco Elysium or is there some drama associated with it?
Drama. A business partner of the creators used an illegal loophole to obtain a majority stake of the company and then fired the actual creators because they were considered too volatile.
The universe of Disco Elysium is Kurvitz's paracosm, which he has been creating since his teens. It's a part of their identity that they are now barred from expressing.
It's a bit like if you told Tolkien halfway through writing LotR that he is fired as the author and can never write anything about Middle-earth again.
You should totally play the game, but make sure that you pirate it so your money doesn't go to the thief who stole the rights from the creators.
Oh that's unfortunate. Well I don't mind not supporting people like that so I'll give it a go
Would it not then be better to buy a shady key to financially hurt the company?
You mean cause a chargeback or something? You'd have to find a sufficiently shady seller, the key might get revoked, also you're supporting another ilk of scumbags.
It's not actually a very fun game to play, reading the lore or watching a video of someone else play is sufficient.
Disagree, I think being in the pilot seat is important. The immersion of control amplifies the experience.
But it would be a copyright infringement.
Oh so when big companies do it, it's OK. But it's stealing when an open-source AI gives that same power back to the people.
Midjourney isn't open source. I can't run it on my PC, unlike Stable Diffusion.
That's part of the strategy. First, go after the small project that can't defend itself. Use that to set a precedent that is harder for the bigger targets to overturn.
I would expect the bigger players to get themselves involved in the defense for exactly that reason.
I say this as a massive AI critic: Disney does not have a legitimate grievance here.
AI training data is scraping. Scraping is — and must continue to be — fair use. As Cory Doctorow (fellow AI critic) says: Scraping against the wishes of the scraped is good, actually.
I want generative AI firms to get taken down. But I want them to be taken down for the right reasons.
Their products are toxic to communication and collaboration.
They are the embodiment of a pathology that sees humanity — what they might call inefficiency, disagreement, incoherence, emotionality, bias, chaos, disobedience — as a problem, and technology as the answer.
Dismantle them on the basis of what their poison does to public discourse, shared knowledge, connection to each other, mental well-being, fair competition, privacy, labor dignity, and personal identity.
Not because they didn’t pay the fucking Mickey Mouse toll.
You did not read your source. Some quotes you apparently missed:
Scraping to violate the public’s privacy is bad, actually.
Scraping to alienate creative workers’ labor is bad, actually.
Please read your source before posting it and claiming it says something it doesn't actually say.
Now why does Doctorow distinguish between good scraping and bad scraping, and even between good LLM training and bad LLM training in his post?
Because the good applications are actually covered by fair use while the bad parts aren't.
Because fair use isn't actually about what is done (scraping, LLM training, ...) but about who does it (researchers, non-profit vs. companies, for-profit) and for what purpose (research, critique, teaching, news reporting vs. making a profit by putting original copyright owners out of work).
That's the whole point of fair use. It's even in the name. It's about the use, and the use needs to be fair. It's not called "Allowed techniques, don't care if it's fair".
Are you saying that the mere action of scraping is fair use, or that absolutely anything you do with the data you scrape is also fair use?
The Doctorow article does not say "scraping is good actually" - it says "scraping in X circumstance is good" and "scraping in Y circumstance is bad", and wraps up by admitting the obvious and glaring contradiction.
I'd say that scraping as a verb implies an element of intent. It's about compiling information about a body of work, not simply making a copy, and therefore if you can accurately call it "scraping" then it's always fair use. (Accuse me of "No True Scotsman" if you would like.)
But since it involves making a copy (even if only a temporary one) of licensed material, there's the potential that you're doing one thing with that copy which is fair use, and another thing with the copy that isn't fair use.
Take archive.org for example:
It doesn't only contain information about the work, but also a copy (or copies, plural) of the work itself. You could argue (and many have) that archive.org only claims to be about preserving an accurate history of a piece of content, but functionally mostly serves as a way to distribute unlicensed copies of that content.
I don't personally think that's a justified accusation, because I think they do everything in their power to be as fair as possible, and there's a massive public benefit to having a service like this. But it does illustrate how you could easily have a scenario where the stated purpose is fair use but the actual implementation is not, and the infringing material was "scraped" in the first place.
But in the case of gen AI, I think it's pretty clear that the residual data from the source content is much closer to a linguistic analysis than to an internet archive. So it's firmly in the fair use category, in my opinion.
Edit: And to be clear, when I say it's fair use, I only mean in the strict sense of following copyright law. I don't mean that it is (or should be) clear of all other legal considerations.
"if you can accurately call it 'scraping' then it's always fair use."
I think you make some compelling points overall, but fair use has always been more complex than this. The intent is taken into account when evaluating whether something is fair use, but so is the actual impact — "fair use" is a designation applied to the overall situation, not to any singular factors (so a stated purpose can't be fair use)
I think the distinction between data acquisition and data application is important. Consider the parallel of photography: you are legally and ethically entitled to take a photo of anything that you can see from a public place (i.e., you can "scrape" it). But that doesn't mean that you can do anything you want with those photos. Distinguishing them makes the scraping part a lot less muddy.
oooooo do OpenAI next!
requests that Midjourney be made to pay up for the damage it has caused the two companies.
good luck proving and putting an accurate number to that perceived 'damage'?
Easy, ten gajillion dollars. Payable in stock.
No problem, how much is "everything" in USD?
Removes sunglasses ..."Let them fight"
The enemies of my enemies are my friends.
But if both sides are your enemies, they're both your friends. But if they're your friends, they aren't the enemies of your enemies anymore, which would make them your enemies once again. But then they are your friends again. But then
But if both sides are your enemies, they're both your friends.
Yes. And both of my friends will weaken both of my enemies.
The worst person you know just made a great point
I dunno where I stand on this one. I can see Disney's argument and agree with it at first glance, but at the same time, are artists doing fan art infringing copyright, then?
Artists doing fan art are infringing copyright, yes. If the fan art meets the fair use criteria then they are not infringing.
Companies usually overlook the infringement from fan artists because it's free advertising and the public backlash is not worth going after lone artists. They will usually go after fan artists who profit from it.
Stupid lawsuit because anyone can do AI now.
that's a shit take.
anyone can do AI now, but everyone can't profit from it like they can. that's why the lawsuit.
Remember when stealing at sea was piracy? Always has been.
Copyright infringement is different.
Yes. Piracy in the sense of stealing from ships in international waters is different from piracy in the sense of copyright infringement. Thanks for that.
I didn't mean to suggest that. I consider calling copyright infringement "piracy" to be propaganda started by the music industry to push their monetary interests. A derogatory term that conflates it with immoral stealing (and murder). This overstates any harms caused.
The biggest pirates in the world should know.