I’ve been avoiding this question up until now, but here goes:
Hey Siri …
- how many r’s in strawberry? 0
- how many letter r’s in the word strawberry? 10
- count the letters in strawberry. How many are r’s? ChatGPT … 2
I think I have seen this exact post word for word fifty times in the last year.
And yet they apparently still can't get an accurate result with such a basic query.
Meanwhile... https://futurism.com/openai-signs-deal-us-government-nuclear-weapon-security
It's predictive text on speed. The LLMs currently in vogue hardly qualify as A.I., tbh.
Still, it’s kinda insane that two years ago we couldn’t have imagined instructing programs with things like “be helpful but avoid sensitive topics”.
That was definitely a big step in AI.
This is a bad example. If I ask a friend "is strawberry spelled with one or two r's?" they would think I'm asking about the last part of the word.
The question seems specifically made to trip up LLMs. I've never heard anyone ask how many of a certain letter are in a word. I've heard people ask how you spell a word, and whether it's with one or two of a specific letter, though.
If you think of LLMs as something with actual intelligence, you're going to be very unimpressed. It's just a model to predict the next word.
This is exactly the problem, though. They don’t have “intelligence” or any actual reasoning, yet they are constantly being used in situations that require reasoning.
What situations are you thinking of that require reasoning?
I've used LLMs to create software I needed but couldn't find online.
Creating software is a great example, actually. Coding absolutely requires reasoning. I’ve tried using code-focused LLMs to write blocks of code, or even some basic YAML files, but the output is often unusable.
It rarely makes syntax errors, but it will do things like reference libraries that haven’t been imported or hallucinate functions that don’t exist. It also constantly misunderstands the assignment and creates something that technically works but doesn’t accomplish the intended task.
Maybe if you focus on pro- or anti-AI sources, but if you talk to actual professionals or hobbyists solving real problems, you'll see very different applications. If you go into it looking for problems, you'll find them; likewise, if you go into it looking for use cases, you'll find them.
If you think of LLMs as something with actual intelligence you're going to be very unimpressed
Artificial sugar is still sugar.
Artificial intelligence implies there is intelligence in some shape or form.
Exactly. The naming of the technology would make you assume it's intelligent. It's not.
Artificial sugar is still sugar.
Because it contains sucrose, fructose, or glucose? Because it metabolises the same way and matches the glycemic index of sugar?
Because those are all wrong. What are your criteria?
It's like someone who has no formal education but has a high level of confidence and eavesdrops on a lot of random conversations.
You rang?
This is literally just a tokenization artifact. If I asked you how many r’s are in /0x5273/0x7183 you’d be confused too.
Sure, but I definitely wouldn’t confidently answer “two”.
I asked mistral/brave AI and got this response:
How Many Rs in Strawberry
The word "strawberry" contains three "r"s. This simple question has highlighted a limitation in large language models (LLMs), such as GPT-4 and Claude, which often incorrectly count the number of "r"s as two. The error stems from the way these models process text through a process called tokenization, where text is broken down into smaller units called tokens. These tokens do not always correspond directly to individual letters, leading to errors in counting specific letters within words.