this post was submitted on 18 Nov 2024
194 points (78.2% liked)
Memes
They do do it in C. The packages are written in C; Python is just the wrapper that lets data scientists with less coding experience use them easily.
That's like the entire data science joke. It's C in a Python trench coat.
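A rough way to see the trench coat, assuming CPython (where the built-in `sum` is implemented in C): compare it with a hand-written pure-Python loop. `py_sum` is a made-up name for illustration only.

```python
import inspect

def py_sum(values):
    """Pure-Python equivalent of the built-in sum, for comparison only."""
    total = 0
    for v in values:
        total += v
    return total

data = list(range(1_000_000))

# On CPython, built-ins implemented in C report as "builtin",
# while functions defined with `def` report as plain functions.
print("sum is a C built-in:", inspect.isbuiltin(sum))
print("py_sum is a Python function:", inspect.isfunction(py_sum))

# Same answer either way; only the layer doing the work differs.
print("same result:", sum(data) == py_sum(data))
```

Both calls look identical from the Python side, which is the whole point of the wrapper.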
Nearly every language's core packages are written in C, and almost every higher-level package has some amount of C in it. That doesn't mean we get to say every program is done in C. And if you keep drilling down, everything is just machine language. Either way, it certainly still disproves OP's point about Python being inefficient.
Saying it's all done in C is hardly even true. Just look at the xformers library on GitHub. Only 2.7% of the code is C, and the entire library is about optimization.
Additionally, the vast majority of the great leaps in ML efficiency haven't come from better-programmed packages, though those have certainly made big strides too. How we calculate has itself changed. That's what makes the greatest optimizations in anything. It doesn't matter what language it is: looping 1000000 times to add 1 is going to perform worse than just multiplying 1 by 1000000. How we calculate, and what we choose to give up (such as determinism in some SDP attention implementations), makes a big difference.
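To make the loop vs. multiply point concrete, here's a minimal Python sketch. The function names are made up for illustration, and the timings will vary by machine, so none are quoted here.

```python
import timeit

def add_by_loop(n):
    """Add 1 to a running total n times: work grows with n."""
    total = 0
    for _ in range(n):
        total += 1
    return total

def add_by_multiply(n):
    """Get the same result with a single multiplication."""
    return 1 * n

N = 1_000_000

# Both approaches agree on the answer.
assert add_by_loop(N) == add_by_multiply(N) == N

# Time three runs of each; only the algorithm differs, not the language.
loop_time = timeit.timeit(lambda: add_by_loop(N), number=3)
mult_time = timeit.timeit(lambda: add_by_multiply(N), number=3)
print(f"loop: {loop_time:.4f}s, multiply: {mult_time:.6f}s")
```

Porting `add_by_loop` to C would speed it up, but it would still do a million additions where one multiplication is enough.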
Optimizations also have to be done by someone, whether that's data scientists or anyone else. Like you say, higher-level languages make it possible for them to do that, and that makes a big difference too. If all the programmers had to optimize in C only, we'd still be way behind where we are now in performance.
Just swapping languages doesn't yield better results like OP is implying.