falken
@noroute@lemmy.world @yoasif@fedia.io local LLM inference can be very fast on recent consumer hardware. No need to send anything anywhere; just like their translation feature, do it all on-device.
As an example, with no optimization or GPU support, my @frameworkcomputer@fosstodon.org (AMD) generates around 5 characters/sec from a 4 gigabyte pre-quantized model.
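For the curious, a throughput figure like that is easy to measure yourself. A minimal Python sketch, where `generate` is a hypothetical stand-in for a local model call (e.g. llama.cpp bindings), not any real API:

```python
import time

def generate(prompt: str) -> str:
    # Hypothetical stand-in for an on-device LLM call; a real version
    # would invoke a local runtime such as llama.cpp with a quantized
    # model. The sleep just makes the timing measurable here.
    time.sleep(0.1)
    return "on-device generation keeps the prompt local"

start = time.time()
output = generate("Summarize this page:")
elapsed = time.time() - start

# Characters per second of wall-clock generation time.
chars_per_sec = len(output) / elapsed
print(f"{len(output)} chars in {elapsed:.2f}s = {chars_per_sec:.1f} chars/sec")
```

Swap the stub for a real model call and the same arithmetic gives the chars/sec number.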
@kuneho Meta is not inventing this out of the goodness of its heart. Just as Google's Privacy Sandbox is fruit of the poisoned tree, the idea should be treated with extreme caution. If not, well, the NSA has a great new encryption standard they'd love you to use too.
#paranoid