submitted 19 hours ago by Templa@beehaw.org to c/technology@beehaw.org
[-] Danterious@lemmy.dbzer0.com 25 points 18 hours ago

That sucks. So much research is being twisted by humanity's greed. I hope that whatever comes after the internet becomes useless is better.

Anti Commercial-AI license (CC BY-NC-SA 4.0)

[-] Zoot@reddthat.com 4 points 6 hours ago

"Humanity is too greedy. My Facebook-esque pretend license will definitely keep me safe!" Lol.

[-] tal@lemmy.today 18 points 18 hours ago

wordfreq is not just concerned with formal printed words. It collected more conversational language usage from two sources in particular: Twitter and Reddit.

Now Twitter is gone anyway, its public APIs have shut down,

Reddit also stopped providing public data archives, and now they sell their archives at a price that only OpenAI will pay.

There's still the Fediverse.

I mean, that doesn't solve the LLM pollution problem, but...

[-] Melody@lemmy.one 8 points 17 hours ago

I'm going to be bold enough to say we don't have as wide of an AI/LLM issue on the Fediverse as the other platforms will have.

I'm certain that if someone did collect data from the Fediverse, it would become a hot topic, and it might not be enough data anyway, as the Fediverse is not mainstream enough. So the data and language collected here might skew in a few imaginable ways that one might find undesirable for a general model of word frequencies.

Also, people might not appreciate that data being collected. Let's be real: it's too soon for such a project to begin. The AI TREND MUST DIE as it currently lives, and its corpse must rot away completely. Now, in internet time that may not be all that long... a few to several years... the memory of the internet can be short-lived at times. It must, however, fade from the public consciousness into some obscurity first.

Once the technology no longer lies in greedy hands, development can begin anew.

[-] Danterious@lemmy.dbzer0.com 4 points 17 hours ago

I’m going to be bold enough to say we don’t have as wide of an AI/LLM issue on the Fediverse as the other platforms will have.

Why do you think that? I don't think that there is anything systemic in how the fediverse operates that will stop LLMs polluting the discourse here too. Actually I already think that they are polluting the discourse here.

Anti Commercial-AI license (CC BY-NC-SA 4.0)

[-] Melody@lemmy.one 2 points 16 hours ago

The filtration capabilities available to most users are pretty robust, depending on what you use to interact with the Fediverse. I think it would be possible to filter out problematic bots, users and even whole domain sources with the right kind of software.
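
To make that concrete, here is a minimal sketch of the kind of client-side filter I mean. It is not taken from any real Fediverse client; `Post`, `BlockList`, and `filter_posts` are hypothetical names, and it assumes authors are written as `user@domain` and that block-list entries are stored in lowercase.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    author: str  # assumed form: "user@domain"
    body: str


@dataclass
class BlockList:
    # Hypothetical block list; entries are assumed to be lowercase.
    users: set[str] = field(default_factory=set)
    domains: set[str] = field(default_factory=set)

    def allows(self, post: Post) -> bool:
        author = post.author.lower()
        # Everything after the last "@" is treated as the home domain.
        domain = author.rpartition("@")[2]
        return author not in self.users and domain not in self.domains


def filter_posts(posts: list[Post], blocklist: BlockList) -> list[Post]:
    # Keep only posts whose author and home domain are not blocked.
    return [post for post in posts if blocklist.allows(post)]
```

A scraper or reader would run `filter_posts` over each batch it fetches before doing anything else with it, so blocked bots and whole instances never reach the rest of the pipeline.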

[-] Danterious@lemmy.dbzer0.com 1 points 16 hours ago

Good point. I have been a lot more active in tailoring my experience here compared to other social media. I wish there were more tools for deciding whether or not you want to block someone though. Sometimes it's not as simple as just looking at their post history. Also, as an aside, I wish it was possible to block votes as well, so the ranking of the content could also be personalized.

Anti Commercial-AI license (CC BY-NC-SA 4.0)

[-] Melody@lemmy.one 2 points 16 hours ago

Such a system might be constructed for one's own scraping needs by taking any one of the current frontends/backends and customizing its behavior so that it could mitigate issues, or ingest or ignore data based on your own inputs, such that your model could be "riding along on a human surfboard with human guidance"

[-] Danterious@lemmy.dbzer0.com 1 points 16 hours ago

such that your model could be “riding along on a human surfboard with human guidance”

Sorry I don't really understand what you're saying here.

Anti Commercial-AI license (CC BY-NC-SA 4.0)

[-] FaceDeer@fedia.io 6 points 17 hours ago

Things change. There was a period before this information was easily available; this repository only goes back to 2013. Now there's a period after this information, too. Things start and eventually they end.

Here's hoping that some neat new things start up in its place.

this post was submitted on 18 Sep 2024
72 points (100.0% liked)