I have prompt padding on; without it I get two scenes with just 8 frames. Are you using the v1.5-2 motion model? That one seems to need the additional camera movement LoRAs, otherwise you get very little movement. I went back to the v1.4 motion model, but it kind of stinks for realism. So far I've only been happy with the text2image workflow; I haven't gotten anything good from img2img.
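If anyone wants to poke at the same stack outside the UI, here's roughly what it looks like in diffusers. This is just a sketch, not my actual workflow, and the Hub repo IDs and LoRA weight are my assumptions about the standard AnimateDiff uploads:

```python
import torch
from diffusers import AnimateDiffPipeline, MotionAdapter, DDIMScheduler
from diffusers.utils import export_to_gif

# v1.5-2 motion module (assumed Hub repo name)
adapter = MotionAdapter.from_pretrained(
    "guoyww/animatediff-motion-adapter-v1-5-2", torch_dtype=torch.float16
)
pipe = AnimateDiffPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", motion_adapter=adapter, torch_dtype=torch.float16
)
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

# Camera-movement LoRA to compensate for the v1.5-2 module's low motion
pipe.load_lora_weights(
    "guoyww/animatediff-motion-lora-pan-left", adapter_name="pan-left"
)
pipe.set_adapters(["pan-left"], adapter_weights=[0.8])
pipe.enable_vae_slicing()
pipe.to("cuda")

frames = pipe(
    prompt="woman on beach wearing dress, photorealistic",
    num_frames=16,
    num_inference_steps=25,
    guidance_scale=7.5,
).frames[0]
export_to_gif(frames, "beach.gif")
```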
I'm running an Intel 12900K and a 3090 with 24 GB of VRAM. Part of the hand issue may be that I'm pushing the resolution beyond spec, up to 768x960. At that resolution I can do 32 frames, plus it interpolates an additional 2 between each generated frame, for a total of 124 frames in the final output. I can go up to 48 frames before hitting out-of-memory errors, but at 48 I start getting two completely different scenes per clip.
Haven't tried adding ControlNet into the mix yet. That's a whole new bag of options that I'm not mentally prepared for.
I've tried using fewer than 75 tokens (literally just "woman on beach wearing dress") and the results weren't coming out much different, stability-wise, than my 300+ token monstrosity prompts that let me play OCD with fabric patterns and hair length and everything else. So I'm not sure why my experience differs so much from the conventional advice. I think the majority of the jumping comes from the dynamic prompts. Here's one that didn't change the prompt per frame (warning: hands!) and it's much more stable: https://files.catbox.moe/rgjbem.mp4. There are definitely a million knobs to fiddle with in these settings, and it's all changing every day anyway, so it's hard to keep up!
That's just the nature of Stable Diffusion. I didn't prompt anything about eye color, so the models fall back on their internal biases: on average, blonde hair = blue eyes and brown hair = brown eyes.
Hmm... this should be a webm video. It works (sort of) on desktop, but not in my Lemmy phone apps. Back to the drawing board.
Fair observation. I need to do more to overcome the biases in the models.
Exactly. Think of a portrait-orientation image: top 25% sky, next 25% head, bottom 50% torso. That will come out way different than top 60% head, bottom 40% chest. Keywords like "closeup, medium shot, cowboy shot" are less effective for me, but that's what you see in lots of tutorial posts for controlling composition through prompting alone. You can even go crazy with the positioning: a portrait photo split vertically, with the head in the left column and the body in the right column, will make them lean over or arch back, etc.
You can get a lot of interesting pose variety by messing with the aspect ratio, for example as in the sketch below. See also regional prompting to carve out spaces within the larger frame. I find that putting the head/hair/face prompts in their own region, then scaling that region, is extremely effective for controlling close-up to wide-shot framing.
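For the aspect-ratio part specifically, here's a minimal sketch of what I mean with plain diffusers txt2img (assuming a stock SD 1.5 checkpoint; the sizes are just examples and need to stay multiples of 8):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "woman on beach wearing dress, photorealistic"
# Same prompt, different canvas shapes -> very different framing and poses
for width, height in [(512, 512), (512, 768), (768, 512), (448, 832)]:
    image = pipe(prompt, width=width, height=height, num_inference_steps=25).images[0]
    image.save(f"beach_{width}x{height}.png")
```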
In the Expanse books, there's a planet called Auberon with an 8-hour rotation, so 4 hours of light and 4 hours of dark. They decided that "1 day" would be light-dark-light and "1 night" would be dark-light-dark. It's really interesting how they describe the way society adapts to a cycle with a midnight sun and both a midmorning and an evening sunset.
I thought I had a good system where each outpost only exported 1 solid, 1 liquid, and 1 gas. That allowed me to isolate and sort at the receiving outpost.
The problem occurs when each outpost's import & export containers get full. At that point, materials flow from the export station to the import location where they can't be unloaded, THEN THEY COME BACK to the original outpost, where they get offloaded. You wind up with the same materials filling both the import and export containers. Now the entire material flow is completely borked, nothing is getting imported, and all you have access to is the stuff that's locally produced.
If you look at the ground map when you land on a "coast" biome, you can see where all the elevation dots go flat. That will help direct you to the ocean if you can't see it over the hills/fog.
Gorgeous!