this post was submitted on 19 Jan 2024
384 points (98.2% liked)

Technology

ChatGPT's new AI store is struggling to keep a lid on all the AI girlfriends

OpenAI: 'We also don’t allow GPTs dedicated to fostering romantic companionship'

[–] CorrodedCranium@leminal.space 6 points 10 months ago (1 children)

I think you can self-host an AI chatbot these days

[–] Killer_Tree@sh.itjust.works 15 points 10 months ago (3 children)

You can, and it's easier than you might think! Check out a platform like Oobabooga (text-generation-webui) and find a nice 4-bit quantized LLM of a flavor you prefer. Check out TheBloke on Hugging Face, who has quantized a ton of great LLMs.

[–] meliaesc@lemmy.world 19 points 10 months ago

What the fuck did you just say?

[–] Haha@lemmy.world 4 points 10 months ago* (last edited 10 months ago)

What’s an LLM? Is it a new form of pyramid scheme?

/s

[–] Lemminary@lemmy.world 4 points 10 months ago (1 children)
[–] barsoap@lemm.ee 8 points 10 months ago (3 children)

https://en.wikipedia.org/wiki/Quantization_(signal_processing)

Roughly speaking: the AI equivalent of reducing bitrate. It works quite well if you're only running the models in inference mode and don't want to train them, as the networks are quite noise-resistant (rounding all the weights is, in essence, introducing noise).
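To make the "rounding is noise" point concrete, here is a minimal sketch of symmetric uniform quantization to 4 bits. It is an illustration only, not how any particular LLM quantizer works (real schemes usually quantize in small groups with per-group scales); the function names and the random toy weights are made up for this example.

```python
import random

def quantize(weights, bits=4):
    """Symmetric uniform quantization: map floats to signed ints of `bits` width."""
    qmax = 2 ** (bits - 1) - 1                      # 7 for 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # one scale for the whole list
    # Round each weight to the nearest step and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the integer codes."""
    return [v * scale for v in q]

# Toy "weights" (hypothetical values, not taken from a real model).
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(1000)]

q, scale = quantize(weights, bits=4)
restored = dequantize(q, scale)

# The difference between original and restored weights is the quantization noise.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(f"scale={scale:.6f}, worst rounding error={max_error:.6f}")
```

The worst-case error is at most half a step (`scale / 2`), which is the noise the network has to tolerate at inference time.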

[–] wikibot@lemmy.world 2 points 10 months ago

Here's the summary for the wikipedia article you mentioned in your comment:

Quantization, in mathematics and digital signal processing, is the process of mapping input values from a large set (often a continuous set) to output values in a (countable) smaller set, often with a finite number of elements. Rounding and truncation are typical examples of quantization processes. Quantization is involved to some degree in nearly all digital signal processing, as the process of representing a signal in digital form ordinarily involves rounding. Quantization also forms the core of essentially all lossy compression algorithms. The difference between an input value and its quantized value (such as round-off error) is referred to as quantization error.


[–] Killer_Tree@sh.itjust.works 2 points 10 months ago (1 children)

Exactly! If you only want to use a Large Language Model (LLM) to run your own local chatbot, then a quantized version will cut memory use dramatically and usually speeds up generation, at the cost of a small loss in output quality. It also allows consumer hardware to run larger models which would otherwise be prohibitively resource intensive.
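A rough back-of-the-envelope sketch of why this matters for consumer hardware: weight storage is roughly parameter count times bits per weight. The model sizes below are illustrative round numbers, and the estimate deliberately ignores overhead such as quantization scales, activations, and the context cache, so treat it as a lower bound.

```python
def weight_memory_gb(n_params, bits_per_weight):
    """Approximate storage for the weights alone, in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

# Hypothetical model sizes, chosen only for illustration.
model_sizes = {
    "7B": 7_000_000_000,
    "13B": 13_000_000_000,
    "70B": 70_000_000_000,
}

for name, n_params in model_sizes.items():
    fp16 = weight_memory_gb(n_params, 16)
    int4 = weight_memory_gb(n_params, 4)
    print(f"{name}: 16-bit ~{fp16:.1f} GB, 4-bit ~{int4:.1f} GB")
```

Going from 16-bit to 4-bit weights cuts the footprint to about a quarter, which is roughly the difference between needing a datacenter GPU and fitting on a gaming card.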

[–] Lemminary@lemmy.world 2 points 10 months ago
[–] Lemminary@lemmy.world 1 points 10 months ago

Ah, thanks! I'm only familiar with the word in other contexts so it made a lot of noise.