this post was submitted on 08 Aug 2023
1049 points (97.9% liked)

Source: https://front-end.social/@fox/110846484782705013

Text in the screenshot from Grammarly says:

We develop data sets to train our algorithms so that we can improve the services we provide to customers like you. We have devoted significant time and resources to developing methods to ensure that these data sets are anonymized and de-identified.

To develop these data sets, we sample snippets of text at random, disassociate them from a user's account, and then use a variety of different methods to strip the text of identifying information (such as identifiers, contact details, addresses, etc.). Only then do we use the snippets to train our algorithms, and the original text is deleted. In other words, we don't store any text in a manner that can be associated with your account or used to identify you or anyone else.

We currently offer a feature that permits customers to opt out of this use for Grammarly Business teams of 500 users or more. Please let me know if you might be interested in a license of this size, and I'll forward your request to the corresponding team.

Jaded@lemmy.dbzer0.com 0 points 1 year ago

It depends on what kind of AI, but no: citing sources and building with only volunteered data is just not possible at our current technological level. I'm mostly talking about LLMs, because that's what's really at stake, and they train on huge amounts of data, like ALL of Stack, GitHub, Reddit, etc. Just fine-tuning them at a consumer level takes more than 50,000 question-and-answer pairs, and that's just one tiny superficial layer added on top.

Grammarly should absolutely add an opt-out option to gain consumers' trust, but forcing the whole industry to do so would be a disaster.

If individuals can opt out, so will websites, to "protect their users". Then we get data hoarding, where Stack and GitHub opt out of all open-source options but sell the data to the only ones that can now afford to build AIs: Microsoft and Google. It won't include the data of the few individuals who opt out, but I'm guessing the opt-in will eventually be written directly into websites' terms of service: you opt in or you fuck off.

How does anyone except corporations benefit from this kind of circus? In 10 years, AI will be doing most office work. Google isn't dumb and wants that profit. They and OpenAI have all the data, and they can strong-arm or buy what they are missing. Restricting and legislating only widens their moat.