this post was submitted on 18 Jun 2025
863 points (98.8% liked)

Fediverse

[–] umbraroze@slrpnk.net 7 points 23 hours ago (1 children)

The way copyright law works, by default you don't have any right to make use of anything, even if it's posted publicly. Why do people allow Fediverse platforms to do the thing they do? Leniency on their part.

Gathering data from Mastodon for AI training is technically feasible, but that doesn't mean it's legally justified. Many people will object to that. Many already do!

[–] drmoose@lemmy.world 3 points 23 hours ago (2 children)

No, that's not how copyright works. Copyright prohibits distribution, not copying.

[–] umbraroze@slrpnk.net 3 points 23 hours ago (1 children)

Er, yes, my point was copyright very much concerns what you're allowed to do with data. But that goes beyond distribution. Derivative works are a complicated topic.

My point stands: whether you technically can copy stuff has no bearing on whether you're allowed to use it, and for what purpose.

[–] drmoose@lemmy.world 3 points 22 hours ago (1 children)

Well, it depends on the use. If it's a movie that I copied, then I can watch it; if it's a picture, I can print it and put it on a wall at my home. Even for AI training, it's currently considered entirely legal to train on copyrighted data. You can even parse copyrighted data for analytics, which is entirely legal as well.

So you can do a lot with copyrighted data without breaching the copyright, including AI training, which is the article's topic.

[–] umbraroze@slrpnk.net 2 points 22 hours ago (1 children)

Private use of the copyrighted works is pretty much a separate topic entirely.

And while the law isn't settled on the topic, it's wrong to argue AI training is something that happens entirely in a private setting, especially when that work is made available publicly in some form or another.

Sure, there's a problem with the current copyright laws that has to be addressed. It's quite similar to the "TiVo loophole" in OSS licenses. That was addressed, and certainly not in favour of the loophole exploiters. That one could be fixed at the licence level because it was ultimately a licence question, but the AI training question needs to be taken to the legislative level. Internationally, too.

[–] drmoose@lemmy.world 3 points 19 hours ago

I don't think this precedent will ever get set, because we don't have universal global IP protections. The West will never set it due to fear of China winning the AI race.

In their opinion (which I agree with), this is the greater good, and someone's Mastodon posts or similar being fed to an AI training machine is a lesser evil compared to losing technological advantage to the biggest authoritarian state in the world.

[–] maxwellfire@lemmy.world 1 points 23 hours ago* (last edited 23 hours ago) (1 children)

I don't think this is true. While copying might fall under fair use if used for some purpose, you definitely can get in trouble for copying even without distributing those copies.

For example, you can't rent a library book and then photocopy the whole thing for yourself.

[–] drmoose@lemmy.world 3 points 23 hours ago* (last edited 23 hours ago) (1 children)

Those are entirely different laws you're thinking about, like the DMCA, EUCA, database protection laws (yeah lol, it's a real thing), etc. Copyright on its own is about distribution.

That being said, data law is really complex and more often than not comes down to proof of damages rather than explicit protections. Basically it's all lawyer speak rather than an actual idealistic framework that aims to protect someone. This is the primary argument for why copyright is a failed framework: it's always just a battle of lawyers and damages.

[–] maxwellfire@lemmy.world 1 points 23 hours ago* (last edited 22 hours ago)

I still don't think this is correct, for two reasons. 1: I believe the DMCA and friends count as copyright law. 2: Just reading the text of the law (17 U.S. Code § 106):

Subject to sections 107 through 122, the owner of copyright under this title has the exclusive rights to do and to authorize any of the following:

(1) to reproduce the copyrighted work in copies or phonorecords;

(2) to prepare derivative works based upon the copyrighted work;

(3) to distribute copies or phonorecords of the copyrighted work to the public by sale or other transfer of ownership, or by rental, lease, or lending;

(4) in the case of literary, musical, dramatic, and choreographic works, pantomimes, and motion pictures and other audiovisual works, to perform the copyrighted work publicly;

(5) in the case of literary, musical, dramatic, and choreographic works, pantomimes, and pictorial, graphic, or sculptural works, including the individual images of a motion picture or other audiovisual work, to display the copyrighted work publicly; and

(6) in the case of sound recordings, to perform the copyrighted work publicly by means of a digital audio transmission.

It seems pretty clear that only the copyright owner has the right to make copies, subject to a number of exemptions.

Now, IANAL, so I could be missing something pretty huge, but my understanding was that this right to make copies (especially physical ones for physical media) is at the core of copyright law, not just the distribution of those copies (which is captured by right 3).