this post was submitted on 15 Feb 2024
429 points (95.5% liked)
Technology
This is still so bizarre to me. I've worked on 3D rendering engines trying to create realistic lighting and even the most advanced 3D games are pretty artificial. And now all of a sudden this stuff is just BAM super realistic. Not just that, but as a game designer you could create an entire game by writing text and some logic.
In my experience as a game designer, the code that LLMs spit out is pretty shit. It won't even compile half the time, and when it does, it won't do what you want without significant changes.
The correct usage of LLMs in coding, imo, is one use case at a time, building up to what you need from scratch. It requires skill: talking to the AI so it gives you what you want, knowing how to build up to it, reading the code it spits out so you know when it goes south, and actually knowing how to assemble the bigger-picture software from little pieces. But if you are an intermediate dev who is stuck on something, it's a great help.
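The "one use case at a time" workflow described above might look like this sketch. Everything here is illustrative, not from any real prompt session: the idea is to ask for one small, self-contained piece, verify it yourself, and only then ask for the next piece that builds on it.

```python
import re

# Step 1: ask the LLM for one small, self-contained piece.
# Hypothetical example: a slug generator for URLs.
def slugify(title: str) -> str:
    """Lowercase, strip punctuation, join words with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)

# Step 2: read and check the generated piece yourself before composing.
assert slugify("Hello, World!") == "hello-world"

# Step 3: only then ask for the next piece that builds on the verified one.
# Reviewing each step is what keeps you in control of the bigger picture.
def unique_slug(title: str, existing: set[str]) -> str:
    """Append -2, -3, ... until the slug is not already taken."""
    base = slugify(title)
    slug, n = base, 2
    while slug in existing:
        slug = f"{base}-{n}"
        n += 1
    return slug

assert unique_slug("Hello, World!", {"hello-world"}) == "hello-world-2"
```

The point is not the slug code itself but the shape of the loop: small request, human verification, then composition.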
That, or for rubber-duck debugging; it's also great at that.
That sounds like more effort than just... writing the code.
It's situationally useful.
ChatGPT once insisted my JSON was actually YAML.
Technically it is, but I agree that's imprecise and nobody would say so IRL. Unless they're being a pedantic nerd, like I am right now.
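For the curious: the pedantry holds because YAML 1.2 was designed as a superset of JSON, so any valid JSON document is also valid YAML (JSON's braces and brackets are YAML's "flow style"). A minimal illustration using only the standard library; a YAML parser such as PyYAML's `yaml.safe_load` would accept the same string:

```python
import json

# This string is simultaneously valid JSON and valid YAML (flow style):
doc = '{"format": "JSON", "also_valid_yaml": true}'

data = json.loads(doc)
print(data["format"])  # prints: JSON

# The equivalent YAML block style would be:
#   format: JSON
#   also_valid_yaml: true
# A YAML 1.2 parser accepts both forms; a JSON parser accepts only the first.
```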
You should refine your thoughts more instead of dumping a stream of consciousness on people.
Essentially what this stream of consciousness boils down to is "Wouldn't it be neat if AI generated all the content in the game you are playing on the fly?" Would it be neat? I guess so, but I find it incredibly unappealing, very similar to how AI art, stories and now video are unappealing. There's no creativity involved. There's no meaning to any of it. Sentient AI could probably have creativity, but what people like you who get overly excited about this stuff don't seem to understand is how fundamentally limited our AI currently is. LLMs are among the most advanced AI systems right now, and yet all they do is predict text. They have no knowledge, no capacity for learning. It's very advanced autocorrect.
We've seen this kind of hype with crypto, with NFTs, and with metaverse bullshit. You should take a step back and understand what we currently have, and how incredibly far away the thing that has you excited actually is.
I don't mean to be dismissive of your entire train of thought (I can't follow a lot of it, probably because I'm not a dev and not familiar with a lot of the concepts you're talking about) but all the things you've described that I can understand would require these tools to be a fuckload better, on an order we haven't even begun to get close to yet, in order to not be super predictable.
It's all wonderful in theory, but we're not even close to what would be needed to even half-ass this stuff.
Keep in mind that this isn't creating 3D volumes at all. While immensely impressive, the thing being created by this architecture is a series of 2D frames.
Because it's trained on videos of the real world, not on 3d renderings.
Lol, you don't know how cruel that is. For decades programmers have devoted their passion to creating hyperrealistic games and 3D graphics in general, and now, poof, it's here as if by magic wand, and people say "yeah well, you should have made your 3D engine look like the real world, not like shit" :D
Welcome to the club, my friend... Expert after expert has had this experience as AI has developed over the past couple of years, and we keep discovering that the job can be automated way more than we thought.
First it was the customer service chat agents. Then it was the writers. Then it was the programmers. Then it was the graphic design artists. Now it's the animators.
Another programmer here. The bottleneck in most jobs isn't in getting boilerplate out, which is where AI excels; it's in that first and/or last 10-20%, along with deciding which patterns are suitable for your problem, which proprietary tooling you'll need to use, which APIs you're hitting, and what has changed in recent weeks/months.
What AI is achieving is impressive, but as someone who works in AI, I think we're seeing a two-fold problem: we're seeing a limit to what these models can accomplish with their training data, and we're seeing employers bet on weaker AI output over specialist workers.
The former is a great problem, because this tooling could be adjusted to make workers' lives far easier/faster, in the same way that many tools have done so already. The latter is a huge problem, as in many skilled-worker industries we've seen waves of layoffs, and years of enshittification resulting in poorer products.
The latter is also where I think we'll see a huge change in culture. IMO, we'll see existing companies bet it all and die from backing AI over people, and a new wave of companies focus on putting out work of a certain standard to take on the larger players.
This is a really balanced take, thank you
Writer here, absolutely not having this experience. Generative AI tools are bad at writing, but people generally have a pretty low bar for what they think is good enough.
These things are great if you care about tech demos and not quality of output. If you actually need the end result to be good though, you’re gonna be waiting a while.
I agree with everything you said, but it seems in the context of AI development "a while" is like, a few years.
That remains to be seen. We have yet to see one of these things actually get good at anything, so we don’t know how hard that last part is to do. I don’t think we can assume there will be continuous linear progress. Maybe it’ll take one year, maybe it’ll take 10, maybe it’ll just never reach that point.
Yeah a real problem here is how you get an AI which doesn't understand what it is doing to create something complete and still coherent. These clips are cool and all, and so are the tiny essays put out by LLMs, but what you see is literally all you are getting; there are no thoughts, ideas or abstract concepts underlying any of it. There is no meaning or narrative to be found which connects one scene or paragraph to another. It's a puzzle laid out by an idiot following generic instructions.
That which created the woman walking down that street doesn't know what either of those things are, and so it simply cannot use those concepts to create a coherent narrative. That job still falls to the human instructing the AI, and nothing suggests that we are anywhere close to replacing that human glue.
Current AI can not conceptualise -- much less realise -- ideas, and so they can not be creative or create art by any sensible definition. That isn't to say that what is produced using AI can't be posed as, mistaken for, or used to make art. I'd like to see more of that last part and less of the former two, personally.
I kinda 100% agree with you on the art part, since it can't understand what it's doing... On the other hand, I could swear that if you look at some AI-generated images, it's kind of mocking us. It's a reflection of our society in a weird mirror. Like a completely mad or autistic artist that is creating interesting imagery but has no clue what it means. Of course, that exists only in my perception.
But in the sense of "inventive" or "imaginative" or "fertile", I find AI images absolutely creative. As such, they're telling us something about the nature of the creative process, about the "limits" of human creativity - which is in itself art.
When you sit there thinking up or refining prompts, you're basically outsourcing the imaginative, visualizing part of your brain. An "AI artist" might not be able to draw well or even have the imagination, but he might have a purpose or meaning that he's trying to visualize with the help of AI. So AI generation is at least some portion of the artistic or creative process, but not all of it.
Imagine we could have a brain-computer interface that lets us perceive virtual reality, like an extra pair of eyes. It could scan our thoughts, allowing us to "write text" with our brain, and then immediately feed back a visual AI-generated stream that we "see". You'd be a kind of creative superman. Seeing / imagining things in their head is of course what many people do their whole lives, but not in that quantity or breadth. You'd hear a joke and you wouldn't just imagine it, you'd see it visualized in many different ways. Or you'd hear a tragedy and...
Autistic people usually have no trouble understanding the world around them. Many are just unable to interface with it the way people normally do.
Well yes, it's trained on human output. Cultural biases and shortcomings in our species will be reflected in what such an AI spits out.
We use a lot of devices in our daily lives, whether for creative purposes or practical ones. Every such device is an extension of ourselves; some supplement our intellectual shortcomings, others physical ones. That doesn't make the devices capable of doing any of the things we do. We just don't attribute actions or agency to our tools the way we do to living things. Current AI possesses no more agency than a keyboard does, and since we don't consider our keyboards capable of authoring an essay, I don't think one can reasonably say that current AI is, either.
A keyboard doesn't understand the content of our essay, it's just there to translate physical action into digital signals representing keypresses; likewise, an LLM doesn't understand the content of our essay, it's just translating a small body of text into a statistically related (often larger) body of text. An LLM can't create a story any more than our keyboard can create characters on a screen.
Only if and when we observe AI behaviour indicative of agency can we start to use words like "creative" to describe it. For now (and I suspect for quite some time into the future), all we have are sophisticated statistical content generators.
Still waiting on the programmer part. In a nutshell, AI being, say, 90% perfect means you have 90% working code, i.e. 10% broken code. Images and video (but not sound) are way easier because human eyes kinda just suck. A couple of the videos they've released pass even at a pretty long glance. You only notice funny business once you look closer.
I can't imagine that digital artists/animators have reason to worry. At the upper end, animated movies will simply get flashier, eating up all the productivity gains. In live action, more effects will be pure CGI. At the bottom end, we may see productions hiring VFX artists, just as naturally as they hire makeup artists now.
When something becomes cheaper, people buy more of it, until their demand is satisfied. With food, we are well past that point. I don't think we are anywhere near that point with visual effects.
It seems to me that AI won't completely replace jobs yet (though it will in 10-20 years), but it will reduce demand, because of oversaturation plus ultra-productivity with AI. Moreover, AI will continue to improve. The work of a team of 30 people will be done by just 3.
Yeah. And it's not just how good the images look, it's also the creativity. Everyone tries to downplay this, but I've read the texts and watched those videos, and just going from the prompts, there is a "creative spark" there. It's not a very bright spark lol, but it's there.
I should get into this stuff but I feel old lol. I imagine you could generate interesting levels with obstacles and riddles and "story beats" too.
Because sometimes the generator just replicates bits of its training data wholesale. The "creative spark" isn't its own, it's from a human artist left uncredited and uncompensated.
Artists are "inspired" by existing art or things they see in real life all the time. So that they can replicate art doesn't mean they can't generate art. It's a non sequitur. But I'm sure people are going to keep insisting on this so lets not argue back and forth on this :D