this post was submitted on 26 Aug 2024
18 points (78.1% liked)

AI Generated Images


Bing

prompt:

Food product. Ready to eat meal that cheap and save time on preparation. Dried food. Feature children in ad. Dinner table, family. The ad should be black and white. Visual of family in the ads should be vintage (1980s) era. Fallout styles ads.

top 5 comments
[–] vaultdweller013@sh.itjust.works 7 points 2 months ago

Uh, Vault Boy, ya doin' good? Been suckin' up rads through a silly straw?

[–] tal@lemmy.today 2 points 2 months ago* (last edited 2 months ago) (2 children)

I think that there's some way to get either Midjourney or Bing to reliably produce human-specified text labels correctly; I've seen people do it in some images on here.

I use Stable Diffusion, don't know of a way to do that -- I tend to end up with some missing letters and such too -- but I'm pretty sure that I've seen some people here consistently pull it off on some proprietary AI image generator, where they specify the text.

EDIT: Well, there's ControlNet in SD, but that's kind of a time-intensive way to do it. You'd need to create the outline of the text ahead of time. It can be useful if you want some elaborate effect that is easy for SD but hard for an image editor, like text formed out of a cloud or something. But for simple text like this, in SD, it's probably easier to just remove the text via one of various methods and then re-add the desired text in an image editor.

I do hope that improving on this is one of the next things to be generally rolled out; it's pretty impressive how well existing systems can select and incorporate text. I just want a mechanism that allows more control over specifying what text shows up.

If anyone here does regularly embed text in their images, what system do you use, and how do you do it?

[–] altima_neo@lemmy.zip 3 points 2 months ago

I've had pretty good success with FLUX, but it can also spit out gibberish. Usually takes a few attempts.

[–] Usernameblankface@lemmy.world 3 points 2 months ago (1 children)

With Bing, I put any words I want to see in quotation marks and run the prompt over and over until I get one that works. Fewer words tend to work better. A string 5 or 6 words long usually takes multiple tries. Longer than that might not happen at all.

[–] tal@lemmy.today 2 points 2 months ago

I guess that brute-forcing can work.

For images with multiple passages of text, like this one, you can maybe combine that with inpainting on image generators that provide it (so that once you get one piece of text the way you want it, you can leave it alone and go generate the others).

There's a technique I saw someone use, not to solve this problem but to remove text; I was commenting on it a few days ago. Basically, there's good OCR software out there, and it's capable of detecting text of various sorts. So detextify just keeps running OCR software on a generated image, getting the bounding box of any detected text from the OCR software, and then re-running an inpaint on that bounding box until the OCR software can't detect any text. It's not incredibly compute-efficient, but it is cheap in terms of human time.
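
A minimal Python sketch of the loop described above, not detextify's actual code. The names `remove_text`, `detect_text_boxes`, and `inpaint` are hypothetical; the last two are stand-ins for whatever OCR tool and inpainting model you actually use, passed in as plain functions. `max_rounds` is an added assumption so the loop cannot run forever.

```python
from typing import Callable, List, Tuple, TypeVar

Image = TypeVar("Image")
Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def remove_text(
    image: Image,
    detect_text_boxes: Callable[[Image], List[Box]],  # hypothetical OCR wrapper
    inpaint: Callable[[Image, Box], Image],           # hypothetical inpainting wrapper
    max_rounds: int = 20,
) -> Tuple[Image, int]:
    """Inpaint over detected text until the OCR step finds none.

    Returns the cleaned image and the number of inpainting rounds used.
    Raises RuntimeError if text is still detected after max_rounds rounds.
    """
    for rounds_used in range(max_rounds + 1):
        boxes = detect_text_boxes(image)
        if not boxes:
            # OCR no longer sees any text, so we are done.
            return image, rounds_used
        if rounds_used == max_rounds:
            break
        # Regenerate only the regions where text was found.
        for box in boxes:
            image = inpaint(image, box)
    raise RuntimeError(f"text still detected after {max_rounds} rounds")
```

The cap matters in practice because an inpaint can produce new text in the same region, so nothing guarantees the loop ends on its own.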

I suppose that as long as the OCR software can handle actually reading the text, it might be possible to use a similar technique, but instead of repeating until the OCR software is unable to find text, repeating until it finds text that matches the desired string.
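
A minimal Python sketch of that inverted loop, under the same assumptions: `generate` and `read_text` are hypothetical stand-ins for an image generator (or an inpaint of one region) and an OCR tool, and `retry_until_text_matches` and `normalize` are names made up for illustration. Comparing case-insensitively and ignoring extra whitespace is an assumption too; a stricter or fuzzier match might suit real OCR output better.

```python
from typing import Callable, Optional, Tuple, TypeVar

Image = TypeVar("Image")


def normalize(text: str) -> str:
    """Uppercase and collapse whitespace so minor OCR spacing noise is ignored."""
    return " ".join(text.upper().split())


def retry_until_text_matches(
    wanted: str,
    generate: Callable[[int], Image],   # hypothetical: attempt number -> new image
    read_text: Callable[[Image], str],  # hypothetical OCR wrapper
    max_attempts: int = 10,
) -> Tuple[Optional[Image], int]:
    """Regenerate until OCR reads the wanted string, or give up.

    Returns (image, attempts_used); image is None if no attempt matched.
    """
    target = normalize(wanted)
    for attempt in range(1, max_attempts + 1):
        image = generate(attempt)
        if normalize(read_text(image)) == target:
            return image, attempt
    return None, max_attempts
```

This is the brute-force approach from upthread with the OCR tool doing the checking instead of a person, so it shares the same limit: longer strings may never match within a reasonable number of attempts.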