this post was submitted on 31 Aug 2023
592 points (97.9% liked)

Technology


I'm rather curious to see how the EU's privacy laws are going to handle this.

(Original article is from Fortune, but Yahoo Finance doesn't have a paywall)

[–] Veraticus@lib.lgbt 14 points 1 year ago (4 children)

Because it doesn’t “know” those things in the same way people know things.

[–] hansl@lemmy.ml 23 points 1 year ago (1 children)

It’s closer to how you (as a person) know things than, say, how a database knows things.

I still remember my childhood home phone number. You could ask me to forget it a million times and I wouldn’t be able to. It’s useless information today. I just can’t stop remembering it.

[–] Veraticus@lib.lgbt -4 points 1 year ago* (last edited 1 year ago) (4 children)

No, you knowing your old phone number is closer to how a database knows things than how LLMs know things.

LLMs don't "know" information. They don't retain an individual fact, or know that something is true and something else is false (or that anything "is" at all). Everything they say is generated based on the likelihood of a word following another word based on the context that word is placed in.

You can't ask it to "forget" a piece of information because there's no "childhood phone number" in its memory. Instead there's an increased likelihood it will say your phone number when someone prompts it to tell them a phone number. It doesn't "know" the information at all; the information has simply become part of the weights it uses to generate phrases.
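To make the "likelihood of a word following another word" idea concrete, here is a toy sketch. It assumes a simple bigram count model, which is far cruder than a real LLM (those use learned weights over long contexts, not count tables). The training text, the phone numbers, and the function names are all made up for illustration.

```python
from collections import Counter, defaultdict

# Made-up training text; the phone numbers are hypothetical.
corpus = (
    "my phone number is 555 0199 . "
    "her phone number is 555 0142 . "
    "my phone number is 555 0199 ."
).split()

# Count how often each word follows each other word.
follow_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow_counts[prev][nxt] += 1

def next_word_probs(prev):
    """Estimate the probability of each word that was seen after `prev`."""
    counts = follow_counts[prev]
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

def generate(start, max_words=3):
    """Greedily append the most likely next word, one step at a time."""
    words = [start]
    for _ in range(max_words):
        probs = next_word_probs(words[-1])
        if not probs:
            break
        words.append(max(probs, key=probs.get))
    return " ".join(words)

print(next_word_probs("555"))
print(generate("number"))  # prints "number is 555 0199"
```

Nothing in `follow_counts` is a record labelled "my phone number". The digits come out only because those word-to-word counts happen to be the largest, and the same counts are shared with the other sentence in the corpus.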

[–] Zeth0s@lemmy.world 15 points 1 year ago* (last edited 1 year ago) (1 children)

It's the same in your brain though. There is no number in your brain. Just a set of synapses that allows a depolarization wave to propagate across neurons, via neurotransmitters released and absorbed in a narrow space.

The way the brain is built allows you to "remember" stuff, to reconstruct information that is incompletely stored as different, unique connections in a network. But it is not "certain"; we can't know if it's the absolute truth. That's why we need password databases and phone books: our memory is not a database. It is probably worse than GPT-4.

[–] SpiderShoeCult@sopuli.xyz 6 points 1 year ago (1 children)

Genuinely curious how you would describe humans remembering stuff, because if I remember my biology classes correctly, it's about reinforced neural pathways that become more likely to be taken by an electrical impulse than those that are less 'travelled'. The whole notion of neural networks is right there in the name, based on how neurons work.

[–] MarcoPogo@lemmy.world 5 points 1 year ago (1 children)

Are we sure that this is substantially different from how our brain remembers things? We also remember by association.

[–] Veraticus@lib.lgbt -4 points 1 year ago

But our memories exist -- I can say definitively "I know my childhood phone number." It might be meaningless, but the information is stored in my head. I know it.

AI models don't know your childhood phone number, even if you tell them explicitly, even if they trained on it. Your childhood phone number becomes part of a model of word weights that makes it slightly more likely, when someone asks it for a phone number, that some digits of your childhood phone number might appear (or perhaps the entire thing!).

But the original information is lost.

You can't ask it to "forget" the phone number because it doesn't know it and never knew it. Even if it supplies literally your exact phone number, it isn't because it knew your phone number or because that information is correct. It's because that sequence of numbers is, based on its model, very likely to occur in that order.

[–] theneverfox@pawb.social 1 points 1 year ago

This isn't true at all - first, we don't know things like a database knows things.

Second, they do retain individual facts in the same sort of way we know things, through relationships. The difference is, for us the Eiffel Tower is a concept, and the name, appearance, and everything else about it are relationships - we can forget the name of something but remember everything else about it. LLMs are word based, so the name is everything for them - they can't learn facts about a building, then later learn its name and retain the facts, but they could later learn additional names for it.

For example, researchers did experiments using some visualization tools and edited a model manually. They changed the link between the Eiffel Tower and Paris to Rome, and the model began to believe the tower was in Rome. You could then ask what you'd see from the Eiffel Tower, and it'd start listing landmarks like the Colosseum.

So you absolutely could have it erase facts - you just have to delete relationships or scramble details. It just might have unintended side effects, and no tools currently exist to do this in an automated fashion.
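A rough sketch of what "changing a link" can look like, assuming a toy linear associative memory with one-hot vectors. This is not the method used in those experiments and not how a real LLM stores facts; every name and vector here is hypothetical.

```python
# Toy linear associative memory (hypothetical, not a real LLM):
# W maps a subject's key vector to the value vector of its city.
subjects = {
    "Eiffel Tower": [1.0, 0.0, 0.0],
    "Colosseum": [0.0, 1.0, 0.0],
    "Big Ben": [0.0, 0.0, 1.0],
}
cities = {
    "Paris": [1.0, 0.0, 0.0],
    "Rome": [0.0, 1.0, 0.0],
    "London": [0.0, 0.0, 1.0],
}
facts = [("Eiffel Tower", "Paris"), ("Colosseum", "Rome"), ("Big Ben", "London")]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, k):
    return [dot(row, k) for row in W]

def add_outer(W, v, k):
    """Return W + v k^T (a rank-one update), leaving W itself untouched."""
    return [[w + vi * kj for w, kj in zip(row, k)] for row, vi in zip(W, v)]

def recall(W, subject):
    """Return the city whose value vector best matches W applied to the key."""
    out = matvec(W, subjects[subject])
    return max(cities, key=lambda city: dot(cities[city], out))

def edit(W, subject, new_city):
    """Rewrite one association so W applied to the key gives the new city."""
    k = subjects[subject]
    current = matvec(W, k)
    scale = dot(k, k)
    delta = [(t - c) / scale for t, c in zip(cities[new_city], current)]
    return add_outer(W, delta, k)

# Store the three facts as a sum of rank-one associations.
W = [[0.0] * 3 for _ in range(3)]
for subject, city in facts:
    W = add_outer(W, cities[city], subjects[subject])

edited = edit(W, "Eiffel Tower", "Rome")
print(recall(W, "Eiffel Tower"), "->", recall(edited, "Eiffel Tower"))  # prints "Paris -> Rome"
print(recall(edited, "Big Ben"))  # prints "London"
```

The other facts survive here only because the key vectors don't overlap. If two subjects shared part of their key, the same rank-one update would leak into both, which is the kind of unintended side effect mentioned above.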

For humans, it's much harder - our minds use layers of abstraction and aren't a unified set of info. That means you could zap knowledge of the Eiffel Tower, and we might forget about it. But then thinking about Paris, we might remember it and rebuild certain facts about it, then thinking about world fairs we might remember when it was built and by whom, etc.

[–] dustyData@lemmy.world 12 points 1 year ago

Not only does it not know, but for the people who trained it, it is very hard to know whether some piece of information is or isn't inside the model. Introspection about how exactly the model ends up making decisions after it has been trained is incredibly difficult.

[–] SatanicNotMessianic@lemmy.ml 10 points 1 year ago (1 children)

It’s actually because they do know things in a way that’s analogous to how people know things.

Let’s say you wanted to forget that cats exist. You’d have to forget every cat meme you’ve ever seen, of course, but your entire knowledge of memes would also have to change. You’d have to forget that you knew how a huge part of the trend started with “i can haz cheeseburger.”

You’d have to forget that you owned a cat, which will change your entire memory of your life history about adopting the cat, getting home in time to feed it, and how it interacted with your other animals or family. Almost every aspect of your life is affected when you own an animal, and all of those would have to somehow be remembered in a no-cat context. Depending on how broadly we define “cat,” you might even need to radically change your understanding of African ecosystems, the history of sailing, evolutionary biology, and so on. Your understanding of mice and rats would have to change. Your understanding of dogs would have to change. Your memory of cartoons would have to change - can you even remember Jerry without Tom? Those are just off the top of my head at 8 in the morning. The ramifications would be huge.

Concepts are all interconnected, and that’s how this class of AI works. I’ve owned cats most of my life, so it’s a huge part of my personal memory and self-definition. They’re also ubiquitous in culture. Hundreds of thousands to millions of concepts relate to cats in some way, and each one of them would need to change, as would each concept that relates to those concepts. Pretty much everything is connected to everything else and as new data are added, they’re added in such a way that they relate to virtually everything that’s already there. Removing cats might not seem to change your knowledge of quarks, but there’s some very very small linkage between the two.

Smaller impact memories are also difficult. That guy with the weird mustache you saw during your vacation to Madrid ten years ago probably doesn’t have that much of a cascading effect, but because Esteban (you never knew his name) has such a tiny impact, it’s also very difficult to detect and remove. His removal won’t affect much of anything in terms of your memory or recall, but if you’re suddenly legally obligated to demonstrate you’ve successfully removed him from your memory, it will be tough.

Basically, the laws were written at a time when people were records in a database and each had their own row. Forgetting a person just meant deleting that row. That’s not the case with these systems.

The thing is that we don’t compel researchers to re-train their models on a data set if someone requests their removal. If you have traditional research on obesity, for instance, and you have a regression model that’s looking at various contributing factors, you do not have to start all over again if someone requests their data be deleted. It should mean that the person’s data are removed from your data set, but it doesn’t mean that you can’t continue to use that model - at least it never has, to my knowledge. Your right to be forgotten doesn’t translate to you being allowed to invalidate the scientific models generated that glom together your data with that of tens of thousands of others. You can be left out of the next round of research on that dataset, but I have never heard of people being legally compelled to regenerate a model based on that.
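Here is a small sketch of that distinction, assuming a one-variable least-squares fit. The participants, the numbers, and the `fit_line` helper are invented for illustration and don't come from any real study.

```python
def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical study data: participant id -> (weekly exercise hours, BMI).
dataset = {
    "p1": (0.0, 31.0),
    "p2": (2.0, 29.0),
    "p3": (4.0, 26.5),
    "p4": (6.0, 25.0),
    "p5": (8.0, 23.5),
}

# The model that was fitted and published while p3 was still in the data.
published_model = fit_line(list(dataset.values()))

# p3 asks to be forgotten: deleting their row is the easy part.
del dataset["p3"]

# The published coefficients are just two numbers and are unaffected by the
# deletion. Only a deliberate refit produces a model without p3's influence.
refit_model = fit_line(list(dataset.values()))

print("published:", published_model)
print("refit:    ", refit_model)
```

The row is gone from `dataset`, but `published_model` still carries p3's contribution blended in with everyone else's, and there is no "p3" entry inside it to delete.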

There are absolutely novel legal questions that are going to be involved here, but I just wanted to clarify that it’s really not a simple answer from any perspective.

[–] Zeth0s@lemmy.world 5 points 1 year ago (1 children)

Actually it is also impossible to ask people to forget. This is something we share with AI.

[–] Veraticus@lib.lgbt -1 points 1 year ago (1 children)

Yes, but only by chance.

Human brains can't forget because human brains don't operate that way. LLMs can't forget because they don't know information to begin with, at least not in the same sense that humans do.

[–] Zeth0s@lemmy.world 1 points 1 year ago

See my other reply ;)