this post was submitted on 05 Feb 2025
299 points (82.4% liked)

Technology

[–] Zess@lemmy.world 7 points 10 hours ago (1 children)

You asked a stupid question and got a stupid response; seems fine to me.

[–] interdimensionalmeme@lemmy.ml 1 points 10 hours ago

Yes, nobody asking that question is wondering about the "straw" part of the word. They're asking whether the "berry" part has one "r" or two.

[–] Grabthar@lemmy.world 12 points 14 hours ago

Doc: That’s an interesting name, Mr…

Fletch: Babar.

Doc: Is that with one B or two?

Fletch: One. B-A-B-A-R.

Doc: That’s two.

Fletch: Yeah, but not right next to each other, that’s what I thought you meant.

Doc: Isn’t there a children’s book about an elephant named Babar?

Fletch: Ha, ha, ha. I wouldn’t know. I don’t have any.

Doc: No children?

Fletch: No elephant books.

[–] humorlessrepost@lemmy.world 7 points 13 hours ago* (last edited 13 hours ago) (1 children)

Works fine for me in o3-mini-high:

Counting letters in “strawberry”

Alright, I’m checking: the word “strawberry” is spelled S T R A W B E R R Y. Let me count the letters: S (1), T (2), R (3), A (4), W (5), B (6), E (7), R (8), R (9), Y (10). There are three R’s: in positions 3, 8, and 9. So, the answer is 3. Even if we ignore case, the count still holds. Therefore, there are 3 r’s in “strawberry.”
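For comparison, the count the model narrates above is a one-liner in ordinary code, which sees characters rather than tokens:

```python
# Plain character count over the raw string.
print("strawberry".count("r"))  # 3
```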

[–] interdimensionalmeme@lemmy.ml 0 points 10 hours ago

A normal person would say 'strawberry with two "r"s'

[–] winkly@lemmy.world 4 points 13 hours ago

How many strawberries could a strawberry bury if a strawberry could bury strawberries 🍓

[–] LovableSidekick@lemmy.world 3 points 14 hours ago (2 children)

What would have been different about this if it had impressed you? It answered the literal question and also the question the user was actually trying to ask.

[–] genuineparts@infosec.pub 1 points 2 hours ago

But you realize that it's wrong on both counts, right?

"Strawberry" has three R's; it only has two in the wrong spelling.

[–] Wrrzag@lemmy.ml 10 points 13 hours ago (1 children)

It didn't? StRawbeRy has 2 rs. StRawbeRRy has 3.

[–] LovableSidekick@lemmy.world 6 points 13 hours ago (1 children)

OHHHHHHH.... my bad. I'm an idiot. Being an LLM, it's giving the answer it thinks a human such as myself would come up with.

[–] SharkAttak@kbin.melroy.org 3 points 13 hours ago (1 children)
[–] LovableSidekick@lemmy.world 2 points 13 hours ago

Not last time I checked, but we all could be as far as you know.

[–] ClusterBomb@lemmy.blahaj.zone 24 points 1 day ago (2 children)

"My hammer is not well suited to cut vegetables" 🤷

There is so much to say about AI; can we move on from "it can't count letters and do math"?

[–] Strykker@programming.dev 6 points 15 hours ago (2 children)

But the problem is more "my do-it-all tool randomly fails at arbitrary tasks in an unpredictable fashion," making it hard to trust as a tool in any circumstances.

[–] interdimensionalmeme@lemmy.ml 1 points 10 hours ago

Answer: you're using it wrong. /stevejobs

[–] desktop_user@lemmy.blahaj.zone 1 points 14 hours ago (1 children)

It would be like complaining that a water balloon isn't useful because it isn't accurate. LLMs are good at approximating language; numbers are too specific and have more objective answers.

[–] ReallyActuallyFrankenstein@lemmynsfw.com 8 points 23 hours ago (1 children)

I get that it's usually just a dunk on AI, but it is also still a valid demonstration that AI has pretty severe and unpredictable gaps in functionality, in addition to failing to properly indicate confidence (or lack thereof).

People who understand that it's a glorified autocomplete will know how to disregard or prompt around some of these gaps, but this remains a litmus test because it succinctly shows you cannot trust an LLM response even in many "easy" cases.

[–] daniskarma@lemmy.dbzer0.com 21 points 1 day ago* (last edited 1 day ago) (2 children)

That happens when you don't understand what an LLM is or what its use cases are.

This is like not being impressed by a calculator because it cannot give you a synonym for a word.

[–] Strykker@programming.dev 5 points 15 hours ago (2 children)

But everyone selling LLMs sells them as being able to solve any problem, which makes it hard to know when they're going to fail and give you junk.

[–] daniskarma@lemmy.dbzer0.com 6 points 14 hours ago

And Red Bull gives you wings.

Marketing within a capitalist market be like that for every product.

[–] NikkiDimes@lemmy.world 1 points 14 hours ago

Is anyone really pitching AI as being able to solve every problem though?

[–] xigoi@lemmy.sdf.org 5 points 1 day ago (1 children)

Sure, maybe it’s not capable of producing the correct answer, which is fine. But it should say “As an LLM, I cannot answer questions like this” instead of just making up an answer.

[–] daniskarma@lemmy.dbzer0.com 6 points 1 day ago (3 children)

I have thought a lot about it. The LLM per se would not know whether the question is answerable, as it doesn't know whether its own output is good or bad.

So there are various approaches to this issue:

  1. The classic approach, and the one used for censoring: keywords. When the LLM receives a certain keyword (or can derive one by digesting the text input), it gives back a hard-coded answer. Problem: while censoring targets are limited, hard-to-answer questions are unlimited, so it's impossible to hard-code them all.

  2. Self-check answers (see the sketch below). For every question, the LLM could process it 10 times with different seeds, then analyze the results and see if they are equivalent. If they are not, it just answers that it's unsure. Problem: multiplied resource usage. And for some questions, like the one in the post, it's possible that the multiple randomized answers all agree on the same wrong result, so it would still have a decent failure rate.
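A minimal sketch of approach 2 in Python, assuming a generic `generate(prompt, seed)` function that returns the model's text answer; the function name and agreement threshold here are illustrative, not any real API:

```python
# Self-check by sampling: ask the same question n times with different
# seeds, keep the majority answer only if enough samples agree,
# otherwise abstain.
from collections import Counter

def self_check(prompt, generate, n_samples=10, min_agreement=0.8):
    answers = [generate(prompt, seed=i) for i in range(n_samples)]
    best, count = Counter(answers).most_common(1)[0]
    if count / n_samples >= min_agreement:
        return best
    return "I'm not sure about the answer."
```

As the comment notes, this multiplies cost by `n_samples`, and it still fails when the randomized answers agree on the same wrong result.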

[–] Allero@lemmy.today 12 points 1 day ago* (last edited 1 day ago) (2 children)

Here's my guess, aside from the token issues highlighted elsewhere:

We all know LLMs train on human-generated data. And when we ask something like how many "R"s or "L"s are in a given word, we don't mean to count them all - we normally mean something like "how many consecutive letters are there, so I can spell it right".

Yes, the word "strawberry" has 3 R's. But what most people are interested in is whether it is "strawberry" or "strawbery", and their "how many R's" refers to exactly that, not the entire word.
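In code terms, the question people usually mean is a double-letter check rather than a total count; a tiny sketch:

```python
# The spelling question behind "how many R's": is the r doubled?
word = "strawberry"
print("rr" in word)  # True -> it's "berry", not "bery"
```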

[–] jj4211@lemmy.world 1 points 14 hours ago

It doesn't even see the word 'strawberry'; the input has been tokenized in a way that no longer preserves the 'text' that was typed in.

It's more like it sees a question like: How many 'r's in 草莓?

And it spits out an answer not based on analysis of the input, but a model of what people might have said.
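You can see this for yourself with OpenAI's tiktoken library (pip install tiktoken); the exact pieces vary by model, but they are multi-character chunks, not letters:

```python
# Sketch: how a GPT-style tokenizer chops up "strawberry".
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # GPT-4-era encoding
ids = enc.encode("strawberry")
print([enc.decode([i]) for i in ids])  # a few chunks, not ten letters
```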

[–] gerryflap@feddit.nl 27 points 1 day ago* (last edited 1 day ago) (1 children)

These models don't get single characters but rather tokens representing multiple characters. While I also don't like the "AI" hype, this image is very one-dimensional hate and misrepresents the usefulness of these models by picking one adversarial example.

Today ChatGPT saved me a fuckton of time by linking me to the exact GitLab issue discussing the problem I was having (full system freezes using Bottles installed with flatpak on Arch). This was the URL it came up with after I explained the problem and gave it the first error I found in dmesg: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/110

This issue is one day old. When I looked this shit up myself I found exactly nothing useful on either DDG or Google. After this, ChatGPT also provided me with the information that the LTS kernel exists and how to install it. Obviously I verified that stuff before using it, because these LLMs have their limits. Now my system works again, and figuring this out myself would've cost me hours because I had no idea what broke. Was it flatpak, Nvidia, the kernel, Wayland, Bottles, some random shit I changed in a config file 2 years ago? Well, thanks to ChatGPT, I know.

They're tools, and they can provide new insights that can be very useful. Just don't expect them to always tell the truth, or to actually be human-like

[–] lennivelkant@discuss.tchncs.de 6 points 1 day ago (1 children)

Just don't expect them to always tell the truth, or to actually be human-like

I think the point of the post is to call out exactly that: people preaching AI as replacing humans

[–] desktop_user@lemmy.blahaj.zone 0 points 14 hours ago

It can, in the same way a loom did, just for more language-y tasks. A multimodal system might be better at answering that type of question by first detecting that this is a question of fact, and that running a bucket-sort-style letter count on the word "strawberry" will answer it better than its questionably obtained correlations.
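Taken literally, that bucket idea is a few lines of Python (a sketch of the suggestion, not anything a deployed system actually does):

```python
# Bucket-style letter count: one bucket per character, then read off 'r'.
from collections import defaultdict

buckets = defaultdict(int)
for ch in "strawberry":
    buckets[ch] += 1
print(buckets["r"])  # 3
```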

[–] zipzoopaboop@lemmynsfw.com 9 points 1 day ago

I asked Gemini if the Quest has an SD slot. It doesn't, but Gemini said it did. Checking the source, it was pulling info from the Vive user manual.

[–] eggymachus@sh.itjust.works 36 points 1 day ago

A guy is driving around the back woods of Montana and he sees a sign in front of a broken down shanty-style house: 'Talking Dog For Sale.'

He rings the bell and the owner appears and tells him the dog is in the backyard.

The guy goes into the backyard and sees a nice looking Labrador Retriever sitting there.

"You talk?" he asks.

"Yep" the Lab replies.

After the guy recovers from the shock of hearing a dog talk, he says, "So, what's your story?"

The Lab looks up and says, "Well, I discovered that I could talk when I was pretty young. I wanted to help the government, so I told the CIA. In no time at all they had me jetting from country to country, sitting in rooms with spies and world leaders, because no one figured a dog would be eavesdropping. I was one of their most valuable spies for eight years running... but the jetting around really tired me out, and I knew I wasn't getting any younger so I decided to settle down. I signed up for a job at the airport to do some undercover security, wandering near suspicious characters and listening in. I uncovered some incredible dealings and was awarded a batch of medals. I got married, had a mess of puppies, and now I'm just retired."

The guy is amazed. He goes back in and asks the owner what he wants for the dog.

"Ten dollars" the guy says.

"Ten dollars? This dog is amazing! Why on Earth are you selling him so cheap?"

"Because he's a liar. He's never been out of the yard."

[–] whotookkarl@lemmy.world 51 points 1 day ago (3 children)

I've already had more than one conversation where people quote AI as if it were a source, like quoting Google as a source. When I showed them how it can sometimes lie and explained that it's not a primary source for anything, I just got that blank stare, like I have two heads.

[–] VintageGenious@sh.itjust.works 79 points 1 day ago (48 children)

Because you're using it wrong. It's good for generative text and chains of thought, not symbolic calculations, including math or linguistics.

[–] Grandwolf319@sh.itjust.works 39 points 1 day ago* (last edited 1 day ago) (2 children)

There is an alternative reality out there where LLMs were never marketed as AI and were instead marketed as random text generators.

In that world, tech-savvy people would embrace this tech instead of having to constantly educate people that it is in fact not intelligence.

[–] daniskarma@lemmy.dbzer0.com 4 points 1 day ago

They are not random per se. They are just statistical, with some degree of randomization.
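A toy sketch of "statistical with some randomization": fixed next-token scores, softmax with a temperature, then a random draw (all numbers made up):

```python
# Temperature sampling over made-up next-token scores.
import math, random

logits = {"berry": 2.0, "bery": 0.5, "banana": -1.0}  # invented scores
temperature = 0.8
weights = {t: math.exp(s / temperature) for t, s in logits.items()}
total = sum(weights.values())
probs = {t: w / total for t, w in weights.items()}
print(random.choices(list(probs), weights=list(probs.values()))[0])
```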

[–] AA5B@lemmy.world 2 points 1 day ago

I’ve been avoiding this question up until now, but here goes:

Hey Siri …

  • "how many r’s in strawberry?" → 0
  • "how many letter r’s in the word strawberry?" → 10
  • "count the letters in strawberry. How many are r’s?" → ChatGPT ….. 2