submitted 1 year ago by dl007@lemmy.ml to c/technology@lemmy.ml
ayaya@lemmy.fmhy.ml 8 points 1 year ago* (last edited 1 year ago)

It is impossible for an AI to cite its sources, at least with the current way of doing things. The AI itself doesn't know where any particular text comes from. Large language models are essentially really complex word predictors: they look at the previous words and then predict the word that comes next.
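
To make "word predictor" concrete, here is a toy sketch. It is nowhere near a real LLM (those use neural networks over tokens, not a lookup table of word counts), and the names `train_bigram` and `predict_next` are made up for illustration. It only shows the basic loop of "look at the previous word, guess the next one":

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count how often each word follows each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Tiny made-up corpus, purely for illustration.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # prints "cat" ("cat" follows "the" twice, "mat" once)
```

Note that `model` only holds counts. Nothing in it records which sentence or document a given count came from.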

During training, the model adjusts weights that relate different words and phrases to each other. If one source makes a certain weight go up by 0.0001%, then another does the same, and then a third makes it go down a bit, and so on, how do you determine which ones affected the outcome? Multiply this over billions if not trillions of words and there's no realistic way to track where any particular text came from, unless the output happens to quote something exactly.
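
A rough sketch of why the bookkeeping gets lost, assuming a single weight and hand-picked nudge values (real models have billions of weights and the updates come from gradients, not a list like this; the source names and numbers below are invented):

```python
def train(nudges):
    """Apply each source's nudge to one shared weight and return the result."""
    weight = 0.0
    for _source, delta in nudges:
        # The source name is never stored; only the running total is kept.
        weight += delta
    return weight

# Two hypothetical training runs with completely different sources.
run_a = [("blog_post", 0.25), ("forum_thread", 0.25), ("news_article", -0.125)]
run_b = [("textbook", 0.375)]

print(train(run_a) == train(run_b))  # prints True: same weight, different sources
```

Given only the final weight, there is no way to tell which run produced it, and that is with three sources and one weight.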

And if it did happen to quote something exactly, which is basically just random chance, the AI wouldn't even be aware it was quoting anything. When it's running, it doesn't have access to the data it was trained on; it only has the weights on its "neurons." All it knows is that certain words and phrases either do or don't show up together often.

this post was submitted on 10 Jul 2023
334 points (94.9% liked)
