this post was submitted on 22 Jun 2023
Technology
LLMs already rely on web scraping and always have. They get their data from across the entire Internet. Do people really think OpenAI is doing individual integrations with every single website on the Internet?! Are Google and Bing doing that, too?
It's complete FUD.
There may be some complexity with legality here though. Obviously Google and other search engines already have most of Reddit's content indexed, but there are some legal arguments as to whether they can use the content to create derivative works.
If Reddit opens up its API and specifically allows AI companies to use the content to create LLMs and other AI tools, then from a legal point of view those companies may find this far preferable to facing potential legal action further down the road.
Reddit could reach the same agreement without an API, too.