Technology

59596 readers

3404 users here now

This is a most excellent place for technology news and articles.

Our Rules

Follow the lemmy.world rules.
Only tech related content.
Be excellent to each another!
Mod approved content bots can post up to 10 articles per day.
Threads asking for personal tech support may be deleted.
Politics threads may be removed.
No memes allowed as posts, OK to post as comments.
Only approved bots from the list below, to ask if your bot can be added please contact us.
Check for duplicates before posting, duplicates may be removed

Approved Bots

founded 1 year ago

MODERATORS

448

2 authors say OpenAI 'ingested' their books to train ChatGPT. Now they're suing, and a 'wave' of similar court cases may follow. (www.businessinsider.com)

submitted 1 year ago by L4s@lemmy.world to c/technology@lemmy.world

115 comments fedilink hide all child comments

Two authors sued OpenAI, accusing the company of violating copyright law. They say OpenAI used their work to train ChatGPT without their consent.

you are viewing a single comment's thread
view the rest of the comments

[–] dhork@lemmy.world 8 points 1 year ago* (last edited 1 year ago) (1 children)

You can write that blog post because you are a human, and your summary qualifies for copyright protection, because it is the unique output of a human based on reading the copywrited material.

But the US authorities are quite clear that a work that is purely AI generated can never qualify for copyright protection. Yet since it is based on the synthesis of works under copyright, it can't really be considered public domain either. Otherwise you could ask the AI "Write me a summary of this book that has exactly the same number of words", and likely get a direct copy of the book which is clear of copyright.

I think that these AI companies are going to face a reckoning, when it is ruled that they misappropriated all this content that they didn't explicitly license for use, and all their output is just fringing by definition.

[–] Whimsical@lemmy.world 3 points 1 year ago

I'm expecting a much messier "resolution" that'll look a lot like YouTube's copyright situation - their product can be used for copyright infringement, and they'll be required by law to try and take appropriate measures to prevent it, but will otherwise not be held liable as long as they can claim such measures are being taken.

Having an AI recite a long text to bypass copyright seems equivalent in my mind to uploading a full movie to youtube. In both cases, some amount of moderation (itself increasingly algorithmic) is required to not only be applied, but actively developed and advanced to flout efforts to bypass it. For instance, youtube pirates will upload things with some superficial changes like a filter applied or showing the movie on a weird angle or mirrored to bypass copyright bots, which means the bots need to be more strict and better trained, or else youtube once again becomes liable for knowing about these pirates and not stopping them.

The end result, just like with youtube, will probably be that AI models have to have big, clunky algorithms applied against their outputs to recalculate or otherwise make copyright-safe anything that might remotely be an infringement. It'll suck for normal users, pirates will still dig for ways to bypass it, and everyone will be unhappy. If youtube is any indicator, this situation can somehow remain stable for over a decade - long enough for AI devs to release a new-generation bot to restart the whole issue.

Yaaaaaaaaay