this post was submitted on 09 Jun 2025
823 points (91.9% liked)

(page 4) 50 comments
[–] muntedcrocodile@lemm.ee 3 points 5 days ago* (last edited 4 days ago) (1 children)

This isn't the strength of gpt-o4. The model has been optimised for tool use as an agent. That's why it's so good at image generation relative to other models: it uses tools to construct an image piece by piece, much like a human would. Poor system prompting probably plays a role too. An LLM is not a universal thinking machine; it's a universal process machine. An LLM understands the process and uses tools to accomplish it, hence its strength at writing code (especially as an agent).

It's similar to how a monkey is far better at remembering a sequence of numbers than a human could ever be, yet is totally incapable of even comprehending writing numbers down.
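The tool-use pattern the comment describes can be sketched in a few lines: the model doesn't compute an answer itself, it names a process (a tool) plus arguments, and a runtime dispatches the call. This is a toy illustration with a stubbed model, not any vendor's real agent API.

```python
# Minimal sketch of agent-style tool dispatch. The "model" only emits a
# structured tool call; the runtime looks the tool up and executes it.
# All names here are illustrative, not a real LLM API.

import json

# Tool registry: plain functions the model can request by name.
TOOLS = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}

def fake_model(prompt: str) -> str:
    """Stand-in for an LLM: returns a tool call, not a final answer."""
    if "sum" in prompt:
        return json.dumps({"tool": "add", "args": [2, 3]})
    return json.dumps({"tool": "upper", "args": ["hello"]})

def run_agent(prompt: str):
    """Dispatch loop: parse the model's tool call and execute it."""
    call = json.loads(fake_model(prompt))
    return TOOLS[call["tool"]](*call["args"])
```

The point of the shape: the model's job is routing and process-following, and the actual computation lives in the tools.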

[–] cheese_greater@lemmy.world 3 points 5 days ago (2 children)

Do you have a source for that re:monkeys memorizing numerical sequences? What do you mean by that?

[–] shalafi@lemmy.world 2 points 4 days ago

That threw me as well.

[–] Pamasich@kbin.earth -2 points 4 days ago (4 children)

Isn't the Atari just a game console, not a chess engine?

Like, Wikipedia doesn't mention anything about the Atari 2600 having a built-in chess engine.

If they were willing to run a chess game on the Atari 2600, why did they not apply the same to ChatGPT? There are custom GPTs that claim to use a Stockfish API or to play at a similar level.

As it stands, the comparison is just unfair. Neither platform is designed to handle the task by itself, but one of them is given the necessary tooling and the other isn't. No matter what you think of ChatGPT, that's not a fair comparison.


Edit: Given the existing replies and downvotes, I think this comment is being misunderstood. I would like to try clarifying again what I meant here.

First of all, I'd like to ask whether this article is satire. That's the only way I can make sense of the replies criticizing me on the grounds of how LLMs are marketed, a topic the article never brings up, and neither did I. If the article is a tongue-in-cheek piece about holding LLMs to the standards they're advertised at, I can understand both it and the replies. But the article never suggests so, and my assumption when writing my comment was that it is serious.

The Atari is hardware. It can't play chess on its own; it needs a game cartridge inserted, which the Atari then interfaces with to run the game.

ChatGPT is an LLM. Guess what, it also can't play chess on its own. It also needs to interface with a third party tool that enables it to play chess.

Neither the Atari nor ChatGPT can directly, on their own, play chess. This was my core point.

I merely pointed out that it's unfair that one party in this comparison is given the tool it needs (the cartridge), but the other party isn't. Unless this is satire, I don't see how marketing plays a role here at all.
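The "give ChatGPT its cartridge" idea above amounts to wiring the LLM to an engine through a tool call. Here is a hedged sketch of that shape: the engine is a stub standing in for something like Stockfish, and the message format and function names are hypothetical, not any real API.

```python
# Sketch: the LLM acts only as a router; the move comes from an engine.
# A real setup would talk to Stockfish over UCI; here the engine is a
# stub so the example is self-contained.

def engine_best_move(fen: str) -> str:
    """Stub chess engine. Answers a canned move for the start position."""
    return "e2e4" if fen.startswith("rnbqkbnr") else "resign"

def llm_turn(fen: str) -> dict:
    """Stand-in LLM: instead of guessing a move, it emits a tool call."""
    return {"tool": "engine_best_move", "args": {"fen": fen}}

def play_one_move(fen: str) -> str:
    """Runtime: execute the tool call the model requested."""
    call = llm_turn(fen)
    assert call["tool"] == "engine_best_move"
    return engine_best_move(**call["args"])

# Standard chess starting position in FEN notation.
START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
```

With this wiring the LLM never has to "know" chess, just as the Atari never has to: both delegate to the thing that actually implements the game.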
