cross-posted from: https://lemm.ee/post/53805638

[–] jlh@lemmy.jlh.name 11 points 3 days ago* (last edited 3 days ago) (1 children)

Nvidia cards were the only GPUs used to train DeepSeek v3 and R1, so that narrative still superficially holds. Other stocks like TSMC, ASML, and AMD are also down in pre-market trading.

[–] theunknownmuncher@lemmy.world 15 points 3 days ago (1 children)

Yes, but old and "cheap" ones that were not part of the sanctions.

[–] jlh@lemmy.jlh.name 9 points 3 days ago (1 children)

Ah, fair. I guess it makes sense that Wall Street is questioning the need for these expensive Blackwell GPUs when the Hopper GPUs are already so good?

[–] legion02@lemmy.world 7 points 3 days ago (1 children)

It's more that the newer models are going to need less compute to train and run.

[–] frezik@midwest.social 10 points 3 days ago (1 children)

Right. There are indications that 10x to 100x less compute is needed to train the models to an equivalent level. Not a small thing at all.

[–] NuXCOM_90Percent@lemmy.zip 5 points 3 days ago* (last edited 3 days ago)

Not small but... smaller than you would expect.

Most companies aren't, and shouldn't be, training their own models. Especially with approaches like RAG (retrieval-augmented generation), where you can pair an already-trained model with your proprietary offline data at only a minimal performance hit.
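
To make that concrete, here's a rough sketch of the RAG idea: embed your private documents once, retrieve the closest matches at query time, and hand them to an unmodified pretrained model as prompt context. (Hypothetical snippet, assuming the `sentence-transformers` package; the model name, documents, and query are placeholders, not anything from this thread.)

```python
# Minimal RAG-style retrieval sketch (all data here is made up).
from sentence_transformers import SentenceTransformer
import numpy as np

# Proprietary offline documents the base model was never trained on.
docs = [
    "Q3 internal report: widget defect rate fell to 0.4%.",
    "Support runbook: reboot the edge gateway before escalating.",
    "HR policy: remote work requires manager approval.",
]

# Embed documents once with an off-the-shelf encoder; no training run needed.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = encoder.encode(docs, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q_vec = encoder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec  # cosine similarity (vectors are normalized)
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

query = "What is our widget defect rate?"
context = "\n".join(retrieve(query))
# The retrieved context is prepended to the prompt for an unmodified,
# already-trained LLM -- no GPU-heavy training required on your side.
prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
print(prompt)
```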

What matters is inference and accuracy/validity. Inference is ridiculously cheap (the reason AI/ML got so popular in the first place), while accuracy is a whole different can of worms that industry and researchers don't want you to think about (in part because a "correct" answer might still be a blatant lie, since it's based on human data, which is often blatant lies, but...).

And for the companies that ARE going to train their own models? They make enough bank that ordering the latest Box from Jensen is a drop in the bucket.


That said, this DOES open the door back up for tiered training and the like, where someone might use a cheaper commodity GPU to fine-tune an off-the-shelf model with local data or preferences. But it is unclear how much industry cares about that.
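
For what it's worth, that kind of tiered setup usually means parameter-efficient fine-tuning rather than a full training run. Here's a rough sketch of the idea using LoRA adapters (hypothetical, assuming the `transformers` and `peft` packages; the model name is just a small placeholder, not anything DeepSeek-related):

```python
# Sketch: "enhancing an off-the-shelf model" on a commodity GPU via LoRA.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "gpt2"  # placeholder; any small causal LM works the same way
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Freeze the pretrained weights and train only small low-rank adapters,
# which is what keeps this within reach of a cheap commodity GPU.
config = LoraConfig(
    r=8,                        # adapter rank: tiny vs. the full model
    lora_alpha=16,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% trainable

# From here, a standard training loop over local data updates only the
# adapters; the base model's weights never change.
```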