This is insane. But I'm impressed most of all by the quality of motion. I've quite simply never seen convincing computer-generated motion before. Just look at the way the woolly mammoths connect with the ground, and how their lumbering mass feels real.
Motion-capture works fine because that's real motion, but every time people try to animate humans and animals by hand, even in big-budget CGI movies, the result is ultimately, obviously fake. There are so many subtle things happening in the acceleration and deceleration of all the different parts of an organism that no animator ever gets it 100% right. No animation algorithm gets it to a point where it's believable, just to where it's "less bad".
But these videos seem to be entirely believable for both people and animals. Which is wild.
And then, of course, there's the fact that these are entirely believable 3D spaces, with seemingly full object permanence. As opposed to other efforts I've seen, which basically just briefly animate a 2D scene to make it seem vaguely 3D.
I disagree; just look at the legs of the woman in the first video. First she seems to be limping, then the legs rotate. The mammoths are totally uncanny for me, as they're both running and walking at the same time.
Don't get me wrong, it is impressive. But I think many people will be very uncomfortable with such motion very quickly. Same story as the fingers before.
> I think many people will be very uncomfortable with such motion very quickly
So... I think OP's point stands. (impressive, surpasses human/algorithmic animation thus far).
You're also right. There are "tells." But, a tell isn't a tell until we've seen it a few times.
Jaron Lanier makes a point about novel technology. The first gramophone users thought it sounded identical to a live orchestra. When very early films depicted a train coming towards the camera, people fell out of their chairs... blurry black and white, at a super slow frame rate, projected on a bedsheet.
Early 3D animation was mind-blowing in the 90s. Now it seems like a marionette show. Well... I suppose there was a time when marionette shows were not campy. They probably looked like magic.
It seems we need some experience before we internalize the tells and it starts to look fake. My own eye for CG images seems to be improving faster than the quality. We're all learning to recognize GPT-generated text. I'm sure these generated motions will look more fake to us soon.
That said... the fact that we're having this discussion proves that what we have here is "novel." We're looking at a breakthrough in motion/animation.
Also, I'm not sure "real" is necessary. For games or film what we need is rich and believable, not real.
> You're also right. There are "tells." But, a tell isn't a tell until we've seen it a few times.
Once you have seen a few, you can tell instantly. They all move at 2 keyframes per second, which makes all movements seem alien, and everything in an image moves strangely in sync. The dog moves in slow motion since it needs more keyframes, etc. In that street scene, some people look like they're moving in slow motion and others don't.
People will quickly learn to notice those issues; they aren't even subtle once you are aware of them, not to mention the disappearing objects etc.
And that wouldn't be very easy to fix: they need to train on keyframes because training frame by frame is too much.
But that should make this really easy for others to replicate. You just train on keyframes and then train a model to fill in between keyframes, and you get this. It has some limitations as we see with movement keeping the same pace in every video, but there are a lot of cool results from it anyway.
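To make that concrete, the fill-in half could be as small as a network conditioned on the two surrounding keyframes and a time t. A toy sketch (PyTorch; the architecture, sizes, and names are all invented for illustration, not anything Sora is known to do):

```python
import torch
import torch.nn as nn

class InBetweener(nn.Module):
    """Toy net: predict the frame at time t in [0, 1] between two keyframes."""
    def __init__(self, channels: int = 3):
        super().__init__()
        # Input: the two keyframes stacked, plus one plane holding the constant t.
        self.net = nn.Sequential(
            nn.Conv2d(2 * channels + 1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, key_a, key_b, t):
        b, _, h, w = key_a.shape
        t_plane = torch.full((b, 1, h, w), float(t))
        return self.net(torch.cat([key_a, key_b, t_plane], dim=1))

# Expand 2 keyframes/sec toward 24 fps by querying in-between times.
model = InBetweener()
key_a, key_b = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
frames = [model(key_a, key_b, i / 12) for i in range(12)]
```

A real interpolator would presumably be trained on (keyframe, in-between frame, keyframe) triples sampled from video, but the conditioning scheme is the whole trick.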
I have a friend who has worked on many generations of video compression over the last 20 years. He would rather watch a movie on film without effects than anything on a TV or in a digital theater. He's trained himself to spot defects, and now, even with the latest HEVC H.265, he finds it impossible to enjoy. It's artifacts all the way down and the work never ends. At the Super Bowl he was obsessed with blocking on fast objects, screen-edge artifacts, flat-field colors, and something with the grass.
Luckily, I think he'll retire sooner rather than later, and maybe it will get better then.
I think a lot of these issues could be "solved" by lowering the resolution, using a low quality compression algorithm, and trimming clips down to under 10 seconds.
And by solved, I mean they'll create convincing clips that'll be hard for people to dismiss unless they're really looking closely. I think it's only a matter of time until fake video clips lead to real life outrage and violence. This tech is going to be militarized before we know it.
Yeah, we are very close to losing video as a source of truth.
I showed these demos to my partner yesterday and she was upset about how real AI has become, how little we will be able to trust what we see in the future. Authoritative sources will be more valuable, but they themselves may struggle to publish only the facts and none of the fiction.
Russian authorities deny Sokolov's death and have released proof-of-life footage, which may be doctored or taken before his death. The authoritative source Wikipedia is not much help in establishing the truth here, because without proof of death it must default to toeing the official line.
I predict that in the coming months Sokolov (who just yesterday was removed from his post) will re-emerge in the video realm, and go on to have a glorious career. Resurrecting dead heroes is a perfect use of this tech, for states where feeding people lies is preferable to arming them with the truth.
Sokolov may even go on to be the next Russian President.
> Yeah, we are very close to losing video as a source of truth.
I think this way of thinking is a distraction. No type of media has ever been a source of truth in itself. Videos have been edited convincingly for a long time, and people can lie about their context or cut them in a way that flips their meaning.
Text is the easiest medium to lie in; you can freely just make stuff up as you go, yet we don't say "we cannot trust written text anymore".
Well yeah, duh: you can't trust any type of media just because it is formatted in a certain way. We arrive at the truth by using multiple sources and judging the sources' past track records. AI is not going to change how sourcing works. It might be easier to fool people who have no media literacy, but those people have always been a problem for society.
Text was never looked at as a source of truth the way video was. If you messaged someone something, they wouldn't necessarily believe it. But if you sent them a video of that something, they would feel they had no choice but to believe it.
> Well yeah, duh: you can't trust any type of media just because it is formatted in a certain way
Maybe you wouldn't, but the layperson probably would.
> We arrive at the truth by using multiple sources and judging the sources' past track records
Again, this is something the ideal person would do, not the average layperson. Almost nobody would go through all that to decide whether to believe something or not. Presenting them a video of this something would've been a surefire way to force them to believe it, though, at least before Sora.
> people have always been a problem for society
Unrelated, but I think this attitude is by far the bigger "problem for society". It encourages us to look down on some people even when we do not know their circumstances or reasons, all for an extremely trivial matter. It encourages gatekeeping and hostility, and I think that kind of attitude is at least as detrimental to society as people with no media literacy.
During a significant part of history, text was definitely considered a source of truth, at least to the extent a lot of people see video now. A fancy recommendation letter from a noble would get you far. It makes sense: if you forged it, that meant you had to invest a significant amount of effort, and therefore you had to plan the deception. It's a different kind of behavior than just lying on a whim.
But even then, as nowadays, people didn't trust the medium absolutely. The possibility of forgery was real, as it has been with video, even before generative AI.
To back up this claim, when fictional novels first became a literary format in the Western world, there was immense consternation about the fact that un-true things were being said in text. It actually took a while for authors to start writing in anything besides formats that mimicked non-fictional writing (letters, diary entries, etc.).
As has been pointed out ad nauseam by now, no one's suggesting that AI unlocks the ability to doctor images; they're suggesting that it makes it trivially easy for anyone, no matter how unskilled, to do so.
I really find this constant back and forth exhausting. It's always the same conversation: '(gen)AI makes it easy to create lots of fake news and disinformation etc.' --> 'but we've always been able to do that. have you guys not heard of photoshop?' --> 'yes, but not at this scale, this quickly. can you not see the difference?'
Anyway, my original point was simply to say that a lot of people have (rightly or wrongly) indeed taken photographic evidence seriously, even in the age of photographic manipulation (which as you point out, pretty much coincides with the age of photography itself).
> Yeah, we are very close to losing video as a source of truth.
Why have you been trusting videos? The only difference is that the cost will decrease.
Haven't you seen Hollywood movies? CGI has been convincing enough for a decade. Just add some compression and a shaky mobile cam and it would be impossible to tell the difference on anything.
> Yeah, we are very close to losing video as a source of truth.
We've been living in a post-truth society for a while now. Thanks to "the algorithm" interacting with basic human behavior, you can find something somewhere that will tell you anything is true. You'll even find a community of people who'll be more than happy to feed your personal echo chamber -- downvoting & blocking any objections and upvoting and encouraging anything that feeds the beast.
And this doesn't just apply to "dumb people" or "the others"; it applies to the very people reading this forum right now. You and me and everybody here lives in their safe, sound truth bubble. Don't like what people tell you? Just find somebody or something that will assure you that whatever it is you think, you are thinking the truth. No, everybody else is the asshole who is wrong. Fuck those pond-scum spreaders of "misinformation".
It could be a blog, it could be some AI generated video, it could even be "esteemed" newspapers like the New York Times or NPR. Everybody thinks their truth is the correct one and thanks to the selective power of the internet, we can all believe whatever truth we want. And honestly, at this point, I am suspecting there might not be any kind of ground truth. It's bullshit all the way down.
so where do we go from here? the moon landing was faked, we're ruled by lizard people, and there are microchips in the vaccine. at some level, you can believe what you want to believe, and if the checkout clerk thinks the moon is made of cheese, it makes no difference to me, I still get my groceries. but for things like nuclear fusion: are we actually making progress on it, or is it also a delusion? where the rubber meets the road is how money gets spent on building big projects. is JWST bullshit? is the LHC? ITER? GPS?
we need ground truths for these things to actually function. how else can things work together?
I've always found that take quite ridiculous. Fake videos have existed for a long time. This technology reduces the effort required but if we're talking about state actors that was never an issue to begin with.
People already know that video cannot be taken at face value. Lord of the Rings didn't make anyone believe orcs really exist.
Lord of the Rings had a budget in the high millions and took years to make with a massive advertising campaign.
Riots happen due to out of context video clips. Violence happens due to people seeing grainy phone videos and acting on it immediately. We're reaching a point where these videos can be automatically generated instantly by anyone. If you can't see the difference between anyone with a grudge generating a video that looks realistic enough, and something that requires hundreds of millions of dollars and hundreds of employees to attain similar quality, then you're simply lying.
A key difference in the current trajectory is that it's becoming feasible to generate highly targeted content, down to an individual level. This can also be achieved without state-actor-level resources or the time delays traditionally needed, regardless of budget. The fact that it could also be automated is mildly terrifying.
Coordinated campaigns of hate through the mass media - like kicking up war fever before any major war you care to name - are far more concerning and have already been with us for about a century. Look at WWII and what Hitler was doing for the clearest example; propaganda was the name of the game. The techniques haven't gone anywhere.
If anything, making it cheap enough that people have to dismiss video footage might soften the impact. It is interesting how the internet is making it much harder for the mass media to peddle unchallenged lies or slanted perspectives. This tech might counter-intuitively make it harder again.
I have no doubt trust levels will adjust, eventually. The challenge is that takes a non-trivial amount of time.
It's still an issue with traditional mass media. See basically any political environment where the Murdoch media empire is active. The long tail of (I hate myself for this terminology, but hey, it's HN) 'legacy humans' still vote and have a very real effect on society.
It's funny you mention LotR, because the vast vast vast majority of the character effects were practical (at least in the original trilogy). They were in fact, entirely real, even if they were not true to life.
You can still be enraged by things you know are not real. You can reason about your emotional response, but it's much harder to prevent an emotional response from happening in the first place.
The issue is not even so much generating fake videos as creating plausible deniability. Now everything can be questioned for the pure reason of seeming AI-generated.
Yeah, it looks good at first glance. Also, the fingers are still weird. And I suppose for every somewhat-working vid, there were dozens of garbage ones. At least that was my experience with image generation.
I don't believe movie makers will be out of business any time soon. They will have to incorporate it, though. So far this can make convincing background scenery.
> I don't believe movie makers will be out of business any time soon
My son was learning how to play keyboard and started practicing with a metronome. At some point I thought: why is he learning this at all? We can program which key is pressed at what point in time, and then the software can play it by itself! Why bother?
Then it hit me! Musicians have been able to automate all the instruments with incredible accuracy for a long time. But they never do that. For some reason, they still want a person behind the piano / guitar / drums.
Isn't it obvious? Life is about experiences and enjoyment, all of this tech is fun and novel and interesting but realistically, it's really exciting for tech people because it's going to be used to make more computer games, social media posts and advertisements, essentially, it's exciting because it's going to "make money".
Outside of that, people just want to know what it feels like to be able to play their favorite song on guitar and to go skiing etc.
Being perfect at everything would be honestly boring as shit.
I completely agree. There is more to a product than the final result. People who don't play an instrument see music in terms of money. (Hint: there's no money in music.) But those who play know that the pleasure is in the playing, and in jamming with your mates. Recording and selling are work, not pleasure.
This is true for literally every hobby people do for fun. I am learning ceramics. Everything I've ever made could be bought in a shop for a 100th of the cost, and would be 100 times "better". But I enjoy making the pot, and it's worth more to me than some factory item.
Sora will allow a new hobby, and lots of people will have fun with it. Pros will still need to do pro things. Not everything has to be viewed through the lens of money.
I think it's not. If musicians and only musicians wanted people behind instruments, for the sake of it, there would be a market for autogenerated self-playing music machines among their former patrons, who wouldn't care. And that's not the case; the market for ambient sound machines is small. It takes as much or more insanity to have one at home as, say, having a military armored car in the garage.
On the other hand you've probably heard of an iPod, which I think I could describe as a device dedicated to give false sense of an ever-present musician, so to speak.
So, "they" in "they still want a person behind the piano" is not just limited to hobbyists and enthusiasts. People wants people behind an instrument, for some reason. People pays for others' suffering, not for a thing's peculiarity.
I don't think this is entirely accurate. There are entire genres of music where the audience does not want a person behind the piano/guitar/drums. Plenty of electronic artists have tried the live band gimmick and while it goes down well with a certain segment of the audience, it turns off another segment that doesn't want to hear "humanized" cover versions of the material. But the point is that both of those audiences exist, and they both have lots of opportunity to hear the music they want to hear. The same will be true of visual art created by computers. Some people will prefer a stronger machine element, other people will prefer a stronger human element, and there is room for us all.
> I don't think this is entirely accurate. There are entire genres of music where the audience does not want a person behind the piano/guitar/drums.
Hilariously, nearly every electronic artist I can think of stands in front of a crowd and "plays live" by twisting dials etc., so I think it's fairly accurate.
Carl Cox, Tycho, Aphex Twin, Chemical Brothers, Underworld, to name a few.
DJ performances far outnumber "live" performances in the electronic scene. Perhaps you can cherry-pick certain DJs and make a point that they are creating a new musical composition by live-remixing the tracks they play, but even then a significant number of clubbers don't care, they just want to dance to the music. There are venues where a bunch of the audience can't even see the DJ and they still dance because they are enjoying the music on its own merits.
I stand by my original point. There are plenty of people who really do not care if there is a human somewhere "performing" the music or not. And that's totally fine.
Your reasoning is circular. Humans who go to performances of other humans playing instruments enjoy seeing other humans playing instruments. That should not be surprising. The question is whether humans as a whole intrinsically prefer seeing other humans playing instruments over hearing a "perfect" machine reproduction. And the answer to that question is no. There are plenty of humans who really do prefer the machine reproduction.
If you're still talking about whether people want to hear live covers, or recordings, I think it's an apples to oranges comparison therefore I don't see the point in it.
Mainly to pick songs that fit the mood of the audience. At the moment, humans seem to do a better job "reading" the emotions of other humans in this kind of group setting than computers do, and people are willing to pay for experts who have that skill.
An ML model could probably do a good job at selecting tunes of a particular genre that fit into a pre-defined "journey" that the promoter is trying to construct, so I could see a role for "AI DJs" in the future, especially for low budget parties during unpopular timeslots like first day of a festival while people are still arriving and the crew is still setting up. Some of that is already done by just chucking a smart playlist on shuffle. But then you also have up-and-comer or hobbyist DJs who will play for free in those slots, so maybe there's not really a need for a smarter computer to take over the job.
This whole thread started from the question of why a human should do something when a machine can do it better. And the answer is simple: because humans like to do stuff. It is not because humans doing stuff adds some kind of hand-wavey X factor that other humans intrinsically prefer.
Just to be clear, I was talking about the original sound produced by a person (vs. a machine). Of course it was recorded and played back a _lot_ more than folks listening live.
But point taken; maybe I'm not so familiar with world music. I was talking more about Indian music. While the music is recorded and mixed across several tracks electronically, I think most of it is originally played (or sung) by a person.
In the US at least, there's the occasional acoustic song that becomes a hit, but rock music is obviously on its way to slowly reaching jazz status. It and country are really the last genres where traditional instruments are common during live performances.
Pop, Hip Hop, and EDM basically all are put together as being nearly computer perfect.
All the great producers can play instruments, and that's oftentimes the best way to get a section out initially. But what you hear on Spotify is more and more meticulously put together note by note on a computer after the fact.
Live instruments on stage are now often for spectacle or, worse, a gimmick, and it's not the song people came to love. I think the future will have people like Lionclad[1] in it, pushing what it means to perform live, but I expect them to become fewer and fewer as music gets more complex to produce overall.
Thankfully, art is not about the least common denominator and I'm confident that there will continue to be music played live as long as humanity exists.
Music has a lot of people who believe that not only is their favorite genre the best but that they must tear down people who don't appreciate it.
You aren't better because you prefer live music, you just have a preference. Music wasn't better some arbitrary number of years ago, you just have a preference.
Nobody said one form is objectively better, just that there is a form that is becoming more popular.
But to state my opinion, I can't imagine something more boring than thinking the best music, performance, TV, or media in general was created in the past.
It's not that I think my tastes in music are objectively better, it's that I strongly feel that music is a very personal matter for many people and there will be enough people who will seek out different forms of music than what is "popular". Rock, jazz, even classical music, are still alive and well.
> But to state my opinion, I can't imagine something more boring than thinking the best music, performance, TV, or media in general was created in the past.
And to state my opinion, art isn't about "the best" or any sort of progress, it's about the way we humans experience the world, something I consider to be a timeless preoccupation, which is why a song from 2024 can be equally touching as a ballad from the 14th century.
When I was studying music technology and using state of the art software synthesizers and sequencers, I got more and more into playing my acoustic guitar. There's a deep and direct connection and a pleasure that comes with it that computers (and now/eventually AI) will never be able to match.
(That being said, a realtime AI-based bandmate could be interesting...)
My son is an interesting example of this. I can play all the best guitar music on earth via the speakers, but when I physically get the guitar out and strum it, he sits up like he has just seen god, in total awe of the sound of it, the feel of the guitar, and the sight of it. It's like nothing else compares. Even if he is hysterically crying, the physical instrument and the sound of it make him calm right down.
I wonder if something is lost in the recording process that just cannot be replicated? A live instrument is something that you can actually feel the sound of IMO, I've never felt the same with recorded music even though I of course enjoy it.
I wonder if when we get older we just get kind of "bored" (sadly) and it doesn't mean as much to us as it probably should.
I'm speculating that one would have more mirror neuron activation watching a person perform live, compared to listening to a recording or watching a video. Thus the missing component that makes live performance special.
For me the guitar is like the keyboard I am writing on right now. It will never be replaced, because that is how I input music into the world. I could not program that. I was doing tracker music as a teenager, and all of the songs sounded weird because the timing and so on was not right. And now when I transcribe demos and put them into a DAW, the timing seems to be milliseconds off, not quite right. I still play the piano parts live, because we don't have the technology right now to make it sound better than a human, and even if we had, it would not be my music, but something an AI performed.
I very briefly looked at AI in music; lots of wild things are being made. It is hard to explain: one tool generated a bunch of sliders after mimicking a sample from sine waves (quite accurately).
> Musicians have been able to automate all the instruments with incredible accuracy for a long time. But they never do that. For some reason, they still want a person behind the piano / guitar / drums.
This actually happened on a recent hit, too -- Dua Lipa's Break My Heart. They originally had a drum machine, but then brought in Chad Smith to actually play the drums for it.
Edit: I'm not claiming this was new or unusual, just providing a recent example.
This goes way back. Nine Inch Nails was a synth-first band, with the music written by Trent in a studio on a DAW. That worked, but what really made the band was live shows, so they found ways, even using two drummers, to translate the synths and machines into human-played instruments.
Also, way before that, back in the early '80s Depeche Mode displayed the recorded drum reel onstage so everyone knew what it was, but when they got big enough they also transitioned into an epic live show with guitars and live drums, as well as synth-hooked drum devices they could bang on in addition to keyboards.
We are human. We want humans. Same reason I want a hipster barista to pour my coffee when a machine could do it just as well.
> Same reason I want a hipster barista to pour my coffee when a machine could do it just as well.
I've wondered about this for a long time too: why on earth is anyone still able to be a barista? It turns out people actually like the community around cafes, and often that means interacting with the staff on a personal level.
Some of my best friends have been baristas I've gone to over several years.
It's more than that: doing it well is still beyond sophisticated automation. There are many variables that need to be constantly adjusted for. Humans are still much better at it than machines, regardless of the social element.
If true, probably not for long. Still, my point is that people are the customers. It's more fun to think about what won't change. I think we will still have baristas.
A good live performance is intentionally not 100% the same as the studio version; there can and should be variations. A refrain repeated another time, some improvisation here, playing with the tempo there. It takes a good band, who know each other intimately, to make that work, though. (A good DJ can also do this with electronic music.)
A recorded studio version, I can also listen to at home. But a full band performing in this very moment is a different experience to me.
There are subtle and deliberate deviations in timing and elements like vibrato when a human plays the same song on an instrument twice, which is partly why (aside from recording tech) people prefer live or human musicians.
Think about how precise and exacting a computer can be. It can play the same notes in a MIDI editor with exact timing, always playing note B after 18 seconds of playing note A. Human musicians can't always be that precise in timing, but we seem to prefer how human musicians sound, with all of the variations they make. By comparison, we seem to dislike the precise mechanical repetition of music playback on a computer.
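Incidentally, the standard fix for that mechanical feel is "humanization": adding small random deviations to timing and velocity. A minimal sketch (plain Python with a made-up note format, not any particular DAW's API):

```python
import random

def humanize(notes, timing_sd=0.012, velocity_sd=6):
    """Jitter note start times (seconds) and velocities to mimic a human player."""
    out = []
    for start, pitch, velocity in notes:
        start += random.gauss(0.0, timing_sd)
        velocity += round(random.gauss(0.0, velocity_sd))
        out.append((max(0.0, start), pitch, min(127, max(1, velocity))))
    return out

# Mechanically exact playback: note A at t=0, note B exactly 18 s later.
mechanical = [(0.0, 60, 100), (18.0, 62, 100)]
print(humanize(mechanical))  # e.g. [(0.007, 60, 96), (17.989, 62, 103)]
```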
I think the same point generalises into a broader human dislike of sensory repetition. We want variety. (Compare the first and second grass pictures at [0] and you will probably find that the second, which has more "dirt" and variety, looks better.) "Semantic satiation" seems to be a specific case of the same tendency.
I'm not saying that's something a computer can't achieve eventually but it's something that will need to be done before machines can replace musicians.
Yes. I tried that with some software-based synthesisers (like the SWAM violin and Reason's Friktion) which are designed for human-playing (humans controlling the VST through a device that emits MIDI CC control messages) but my understanding is that the modulation that skilled human players perform with tends to be better/more desirable than what software modulators can currently achieve.
The real dilemma is with composition/song-writing.
The ability to create live experiences can still be a motivating factor for musicians (aside from the love of learning). Yet, if AI does the song-writing far more effectively, will the musician ignore that?
It's like Brave New World. Musicians who don't use these AI tools for song-writing will be like a tribe outside modern world. That's a tough future to prepare for. We won't know whether a song was actually the experience and emotions of a person or not.
Even if we assume that people want fully automated music, the process of learning to play educates the musician. Similarly, you'd still need a director/auteur, editors, writers and other roles I have no appreciation or knowledge of to create a film from AI models.
Steam shovels and modern excavators didn't remove our need for shovels or more importantly, the know-how to properly apply these tools. Naturally, most people use a shovel before they operate an excavator.
It's interesting though; the question really becomes: if 10 people used to shovel manually to feed their families, and now it takes 1 person and an excavator, what in good faith do you tell the other 9... "don't worry, you can always be a hobby shovelist"?
They can apply their labor wherever it is valued. Perhaps they will become more productive excavator operators. By creating value in a specialized field their income would increase. Technology does not decrease the need for labor. Rather it increases the productivity of the laborer.
Human ingenuity always finds a need for value creation. Greater abundance creates new opportunities.
Take the inverse position. Should we go back to reading by candlelight to increase employment in candle making?
No, electric lighting allowed people to become productive during night hours. A market was created for electricity producers, which allowed additional products that consume electricity to be marketed. Technological increases in productivity cascade into all areas of life, increasing our living standards.
A more interesting, if not controversial line of inquiry might start with: If technology is constantly advancing human productivity, why do modern economies consistently experience price inflation?
You miss the important point, which is the productivity gain means the average living standard of society as a whole increases. A chunk of what is now regarded as 'toil' work disappears, and the time freed up is able to be deployed more productively in other areas.
Of course, this change is dislocating for the particular people whose toil disappeared. They need support to retrain to new occupations.
The alternative is to cling to a past where everyone - on average - is poorer, less healthy, and works in more dangerous jobs.
The ‘augmented singer’ is very popular, though. https://en.wikipedia.org/wiki/Auto-Tune: “Auto-Tune has been widely criticized as indicative of an inability to sing on key.”
> Musicians have been able to automate all the instruments with incredible accuracy for a long time. But they never do that. For some reason, they still want a person behind the piano / guitar / drums.
You've never been to a rave, huh? For that matter, there's a lot of pop artists that use sequencers and dispense with the traditional band on stage.
I can see this being used extensively for short commercials, as the uncanny aspect of a lot of the figures will help to capture people's attention. I don't necessarily believe it will be less expensive than hiring a director and film crew however.
> I love these hot takes based on profoundly incredible tech that literally just launched. Acting like 2030 isn't around the corner.
It seems bizarre to think the gee whiz factor in a new commercial creative product makes critiquing its output out-of-bounds. This isn't a university research team: they're charging money for this. Most people have to determine if something is useful before they pay for it.
Yeah. I think people nowadays are in a kind of AI euphoria, and they take every advancement in AI for more than it really is. The realization of the limitations will set in once people have been working with the stuff long enough. The capabilities of the newfangled AIs are impressive. But even more impressive are their mimicry capabilities.
Not dismissing, just being realistic. I've observed that AI tools usually amaze most people initially by showing capabilities never seen before. Then people realise their limitations, i.e. what capabilities are still missing. And they're like: "oh, this is no genie in a bottle capable of satisfying every wish. We'll still have to work to obtain our vision..." So the magic fades away, and the world returns to normal, but now with an additional tool that's very useful in some situations :)
I agree. Skepticism usually serves people well as a lot of new tech turns out to be just hype. Except when it is not and I think this is one of those few cases.
AI won't make artistic decisions that wow an audience.
AI won't teach you something about the human condition.
AI will only enable higher quarterly profits from layoffs until GPU costs catch up.
What the fuck is the point of AI automating away jobs when the only people who benefit are the already enormously wealthy? AI won't be providing time to relax for the average worker; it will induce starvation. Anything to prevent that will be stopped via lobbying to ensure taxes don't rise.
Seriously, what is the point? What is the point? What the fuck is there to live for when art and humanities is undermined by the MBA class and all you fucking have is 3 gig jobs to prevent starvation?
I believe AI and full automation are critical for a Star Trek society.
We are not very good at providing anything reasonable today because capitalism is still way too strong and manual labor still way too necessary.
Nonetheless, look at my country, Germany: we are a social state. Plenty of people get 'free' money and it works.
The other good thing: there are plenty of people who know what good is (good art etc.) but are not able to draw. They can now also express themselves. AI as a tool.
If we as a society discover that no really new music or art is happening, I don't know what we will do.
Plenty of people are well entertained with crap anyway.
I would not. Five (six, seven?) years ago, we had style transfer with video and everyone was also super euphoric about that. If I compare to those videos, there is clearly progress but it is not like we started from zero 2 years ago.
It means "extremely happy", but it's usually used to refer to a particular moment in time (rather than a general sentiment), and so the word sounds a bit out of place here, to me.
"The camera follows behind a white vintage SUV with a black roof": The letters clearly wobble inconsistently.
"A drone camera circles around a beautiful historic church built on a rocky outcropping along the Amalfi Coast": The woman in the white dress in the bottom left suddenly splits into multiple people like she was a single cell microbe multiplying.
On the other hand, this is the first convincing use of a “diffusion transformer” [1]. My understanding is that videos and images are tokenized into patches, through a process that compresses the video/images into abstracted concepts in latent space. Those patches (image/video concepts in latent space) can then be used with transformers (because patches are the tokens). The point is that there is plenty of room for optimization following the first demonstration of a new architecture.
Edit: sorry, it’s not the first diffusion transformer. That would be [2]
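As I understand the patch idea, a video tensor gets chopped into small spacetime blocks, and each block is flattened into one token for the transformer. A rough sketch (PyTorch; the patch sizes are illustrative guesses, not Sora's actual numbers, and real systems patchify in a compressed latent space rather than raw pixels):

```python
import torch

def patchify(video, pt=2, ph=16, pw=16):
    """video: (T, C, H, W) -> (num_patches, pt*ph*pw*C) matrix of tokens."""
    t, c, h, w = video.shape
    assert t % pt == 0 and h % ph == 0 and w % pw == 0
    return (video
            .reshape(t // pt, pt, c, h // ph, ph, w // pw, pw)
            .permute(0, 3, 5, 1, 2, 4, 6)       # group dims by patch position
            .reshape(-1, pt * c * ph * pw))      # flatten each patch to a token

video = torch.rand(16, 3, 256, 256)              # 16 frames of 256x256 RGB
print(patchify(video).shape)                     # torch.Size([2048, 1536])
```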
Really makes me think of The Matrix scene with the woman in the red dress.
Can't tell if they did this on purpose to freak us all out?
Are we all just prompts?
If you watch the background, you'll see one guy has his pants change color. Also, some of the guys are absolute giants compared to the people around them.
Yep. If you look at the details you can find obvious things wrong, and these are limited to 60s in length with zero audio, so I doubt full motion picture movies are going to be replaced anytime soon. B-roll background video or AI-generated backgrounds for a green screen, sure.
I would expect any subscription to this service, when it comes out, to be very expensive. At some point I have to imagine the GPU/CPU horsepower needed will outweigh the monetary costs that could be recovered. Storage costs too. It's much easier to tinker with generating text or static images in that regard.
Of note: NVDA's quarterly results come out next week.
This is weird to me considering how much better this is than the SOTA still images 2 years ago. Even though there's weirdo artefacts in several of their example videos (indeed including migrating fingers), that stuff will be super easy to clean up, just as it is now for stills. And it's not going to stop improving.
When others create text-to-video systems (e.g. Lumiere from Google), they publish the research (e.g. https://arxiv.org/pdf/2401.12945.pdf). OpenAI is all about commercialization. I don't like their attitude.
Google is hardly a good actor here. They just announced Gemini 1.5 along with a "technical report" [1] whose entire description of the model architecture is: "Gemini 1.5 Pro is a sparse mixture-of-expert (MoE) Transformer-based model". Followed by a list of papers that it "builds on", followed by a definition of MoE. I suppose that's more than OpenAI gave in their GPT-4 technical report. But not by much!
The report and the previous one for 1.0 definitely contain much more information than the GPT-4 whitepaper. And Google regularly publishes technical details on other models, like Lumiere, things that OpenAI stopped doing after their InstructGPT paper.
Maybe because GPT-3.5 is closer to what Gemini 1.0 was... GPT-4 and Gemini 1.5 are similarly sparse in their "how we did it and what we used" when it comes to papers.
Not to be overly cute, but if the cutting edge research you do is maybe changing the world fundamentally, forever, guarding that tech should be really, really, really far up your list of priorities and everyone else should be really happy about your priorities.
And that should probably take precedence over the semantics of your moniker, every single time (even if hn continues to be super sour about it)
Nukes aren’t even close to being commodities, cannot be targeted at a class of people (or a single person), and have a minutely small number of users. (Don’t argue semantics with “class of people” when you know what I mean, btw)
On the other hand, tech like this can easily become as common as photoshop, can cause harm to a class of people, and be deployed on a whim by an untrained army of malevolent individuals or groups.
So if someone discovered a weapon of mass destruction (say some kind of supervirus) that could be produced and bought cheaply and could be programmed to only kill a certain class of people, then you'd want the recipe to be freely available?
This poses no direct threat to human life though. (Unlike, say, guns - which are totally fine for everyone in the US!)
The direct threat to society is actually this kind of secrecy.
If ordinary people don't have access to the technology they don't really know what it can do, so they can't develop a good sense of what could now be fake that only a couple of years ago must have been real.
Imagine if image editing technology (Photoshop etc) had been restricted to nation states and large powerful corporations. The general public would be so easy to fool with mere photographs - and of course more openly nefarious groups would have found ways to use it anyway. Instead everybody now knows how easily we can edit an image and if we see a shot of Mr Trump apparently sharing a loving embrace with Mr Putin we can make the correct judgement regarding a probable origin.
The bottleneck for bioterrorism isn't AI telling you how to do something, it's producing the final result. You wanna curtail bioweapons, monitor the BSL labs, biowarfare labs, bioreactors, and organic 3D printers. ChatGPT telling me how to shoot someone isn't gonna help me if I can't get a gun.
This isn't related to my comment. I wasn't asking what if an AI invents a supervirus. I was asking what if someone invents a supervirus. AI isn't involved in this hypothetical in any way.
I was replying to a comment saying that nukes aren't commodities and can't target specific classes of people, and I don't understand why those properties in particular mean access to nukes should be kept secret and controlled.
I understand your perspective regarding the potential risks associated with freely available research, particularly when it comes to illegal weapons and dangerous viruses. However, it's worth considering that by making research available to the world, we enable a collaborative effort in finding solutions and antidotes to such threats. In the case of Covid, the open sharing of information led to the development of vaccines in record time.
It's important to weigh the benefits of diversity and open competition against the risks of bad actors misusing the tools. Ultimately, finding a balance between accessibility and responsible use is key.
What guarantee do we have that OpenAI won't become an evil actor like Skynet?
I'm not advocating for or against secrecy. I'm just not understanding the parent comment I replied to. They said nukes are different than AI because they aren't commodities and can't target specific classes of people, and presumably that's why nukes should be kept secret and AI should be open. Why? That makes no sense to me. If nukes had those qualities, I'd definitely want them kept secret and controlled.
An AI video generator can't kill billions of people, for one. I'd prefer it if access wasn't limited to a single corporation that's accountable to no one and is incentivized to use it for their benefits only.
What do you mean? Are you being dramatic, or do you actually believe that the US government will not / cannot absolutely shut OpenAI down if they feel it's required to guarantee state order?
For the US government to step in, they'd have to do something extremely dangerous (and refuse to share with the government). If we're talking about video generation, the benefits they have are financial, and the lack of accountability is in that they can do things no one else can. I'm not saying they'll be allowed to break the law, there's plenty of space between the two extremes. Though, given how things were going, I can also see OpenAI teaming up with the US government and receiving exclusive privileges to run certain technologies for the sake of "safety". It's what Altman has already been pushing for.
> The right sequence of videos sent to the right people could definitely set something catastrophic off.
...after amazing public world wide demos that show how real the AI generated videos can be? How long has Hollywood had similar "fictional videos" powers?
I think that's great. Billy will feed his flat-earther friends fake videos for a few weeks or months, and pretty soon the entire world will wise up and be highly skeptical of any new such videos. The more of this that gets out there, the quicker people will learn. If it's 1 or 2 videos to spin an election... people might not get wise to it.
Video can convince people to kill each other now because it is assumed to show real things. Show people a Jew killing a Palestinian, and that will rile up the Muslims, or vice versa.
When a significant fraction of video is generated content spat out by a bored teenager on 4chan, then people will stop trusting it, and hence it will no longer have the power to convince people to kill.
You don't need to generate fake videos for that example. The State of Israel has been killing Palestinians en masse for a long time and has intensified the effort over the last 4 months. The death toll is 29,000+ and counting; two thirds are children and women.
The Israeli media machinery parades photographs of damaged houses - damage that could only have been done by heavy artillery or tank shells - while blaming it on rebels carrying infantry rifles.
But I agree: as if the current tools were not enough to sway people, they will now have even more means to sway public opinion.
Hamas has similarly been shooting rockets into Israel for a long time. Eventually people get tired and stop caring about long-lasting conflicts, just like we don't care about concentration camps in North Korea and China, or various deadly civil wars in Sub-Saharan Africa, some of which have killed way more civilians than all wars in Palestinian history. One can already see support towards Ukraine fading as well, even though there Western countries would have a real geopolitical interest.
> Especially considering that the biggest killer app for AI could very well be smart weapons like we've never seen before.
A homing missile that chases you across continents and shows you disturbing deepfakes of yourself until you lose your mind and ask it to kill you. At that point it switches to encourage mode, rebuilds your ego, and becomes your lifelong friend.
I don't think it's really that hard to make a nuclear weapon, honestly. Just because you have the plans for one, doesn't mean you have the uranium/plutonium to make one. Weapons-grade uranium doesn't fall into your lap.
The ideas of critical mass, prompt fission, and uranium purification, along with the design of the simplest nuclear weapon possible has been out in the public domain for a long time.
While it's probably too idealistic to be possible, I'd rather try and focus on getting people/society/the world to a state where it doesn't matter if everyone has access (i.e. getting to a place where it doesn't matter if everyone has access to nuclear weapons, guns, chemical weapons, etc., because no-one would have the slightest desire to use them).
As things are at the moment, while suppression of a technology has benefits, it seems like a risky long-term solution. All it takes is for a single world-altering technology to slip through the cracks, and a bad actor could then forever change the world with it.
As long as destroying things remains at least two magnitudes easier than building things and defending against attacks, this take (as a blanket statement) will continue to be indefensible and irresponsible.
ML models of this complexity are just as accessible as nuclear weapons. How many nations possess a GPT-4? The only reason nuclear weapons are not more common is because their proliferation is strictly controlled by conventions and covert action.
The basic designs for workable (although inefficient) nuclear weapons have been published in open sources for decades. The hard part is obtaining enough uranium and then refining it.
If you have two pieces of plutonium and put them too close together you have accidentally created a nuclear weapon… so yeah nukes are open source, plutonium breeding isn’t.
I love it when people make this “nuke” argument because it tells you a lot more about them than it does about anything else. There are so many low information people out there, it’s a bit sad the state of education even in developed countries. There’s people trotting around the word “chemical” at things that are scary without understanding what exactly the word means, how it differs from the word mixture or anything like that. I don’t expect most people to understand the difference between a proton and a quark but at least a general understanding of physics and chemistry would save a lot of people from falling into the “world is magic and information is hidden away inside geniuses” mentality.
New technology will always mean new giants to see from, but open source really is a nice ladder up to the shoulders of those giants. So many benefits come from sharing the tech.
This is meaningless until you've defined "world changing". It's possible that open sourcing AIs will be world-changing in a good way and developing closed source AIs will be world-changing in a bad way.
If I engineered the tech I would be much more fearful of the possibility of malice in the future leadership of the organization I'm under if they continue to keep it closed, than I would be fearful of the whole world getting the capability if they decide to open source.
I feel that, like with Yellow Journalism of the 1920s, much of the misinformation problem with generative AI will only be mitigated during widespread proliferation, wherein people become immune to new tactics and gain a new skepticism of the media. I've always thought it strange when news outlets discuss new deepfakes but refuse to show it, even with a watermark indicating it is fake. Misinformation research shows that people become more skeptical once they learn about the technological measures (e.g. buying karma-farmed Reddit accounts, or in the 1920s, taking advantage of dramatically lower newspaper printing costs to print sensationalism) through which misinformation is manufactured.
It will be kind of like most of history where the only trustworthy method of communication is with face to face communication or with a letter or book (perhaps cryptographically) verified from a person you personally know or trust. Sounds good to me
How convenient for all the OpenAI employees trying to make millions of dollars by commercializing their technology. Surely this technology won’t be well-understood and easily replicable in a few years as FOSS
From what I remember reading, Open was never supposed to be like open source with the internals freely available, but Open as in available for the public to use, as opposed to a technology only for the company to wield and create content with.
When the time for making money comes, if you don’t think OpenAI will sell every drop of information they have on you, then you are incredibly naive. Why would they leave money on the table when everyone else has been doing it for forever without any adverse effects?
They are currently hiring people with Adtech experience.
The most simple version would be an ad-supported ChatGPT experience. Anyone thinking that an internet consumer company with 100m weekly active users (I‘m citing from their job ad) is not going to sell ads is lacking imagination.
If Google Workspace was selling my or any customers information, at all or "forever", it would not be called Google Workspace, it would be called Google We-died-in-the-most-expensive-lawsuit-of-all-time.
There's a difference. Open AI essentially has 2 products. The chat bot $20 a month thing for Joe shmoe which they admit to training on your prompts, and the API for businesses. Workspace is like the latter. The former is closer to Google search.
Sure, but there is no ambiguity about that, is there? You know that, because they tell you (and, sure, maybe they only tell you, because they have to, by law – but they do and you know)
How do we get from there to "just assume every company in the world will sell your data in wildly and obviously illegal ways", I don't know.
You'd be too late. You're just waiting for someone to imbue a model with agency. We have agency due to evolution. Robots need it programmed into them, and honestly, that is easy to do compared with instilling reasoning. Primitive animals have agency. No animal can reason on the level of GPT. That will get us to HAL 9000. If you stick it in a robot, you have the Terminator.
AI doesn't exist, neither in practice nor in theory. Artificial intelligence is an oxymoron. Intelligence is a complex system; artificial systems are logic systems. You live in a complex universe that you cannot perceive, i.e. we perceive it as noise/randomness only. All you can see are the logical systems expressed at the surface (the Mandelbrot set) of the noise. Everything you see and know is strictly logical; all known laws of the universe are derived from those logical systems. Hence, we can only build logical systems, not complex systems. There is a limit to what we can build here on the surface (Church-Turing). We never have and never will build a complex system.
> Motion-capture works fine because that's real motion
Except in games, where they mo-cap at a frame rate lower than what it will be rendered at and just interpolate between mo-cap samples, which turns snappy movements into smooth movements and lands motions in the uncanny valley.
It's especially noticeable when a character is talking and makes a "P" sound. In a "P", your lips basically "pop" open. But if the motion is smoothed out, it gives the lips the look of making an "mm" sound. The lips of someone saying "post" looks like "most".
At 30 fps, it's unnoticeable. At 144 fps, it's jarring once you see it and can't unsee it.
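The effect is easy to demonstrate: sample a snappy motion at a low rate, then linearly resample it to a higher render rate and the snap is gone. A tiny sketch (plain Python, invented numbers):

```python
def lerp(a, b, t):
    return a + (b - a) * t

mocap = [0.0, 0.0, 1.0, 1.0]   # lip opening at a low sample rate: an instant pop
for i in range(10):            # resample to a higher rate for rendering
    t = i * 3 / 10             # position measured in mo-cap samples
    a = min(int(t), len(mocap) - 2)
    print(round(lerp(mocap[a], mocap[a + 1], t - int(t)), 2))
# Prints 0.0 0.0 0.0 0.0 0.2 0.5 0.8 1.0 1.0 1.0 -- the pop became a ramp.
```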
You might just be subject to confirmation bias here. Perhaps there were scenes and entities you didn't realize were CGI due to high quality animation, and thus didn't account for them in your assessment.
Regarding CGI, I think it has become so good that you don't know it's CGI. Look at the dog in Guardians of the Galaxy 3. There's a whole series on YouTube called "no CGI is really just invisible CGI" that I recommend watching.
And as with CGI, models like Sora will get better until you can't tell them apart from reality. It's not there yet, but it's an immense, astonishing breakthrough.
Serious question: can one just pipe in an SRT (subtitle file), tell it to compare its version to the mp4, and then command it to zoom, enhance, edit, and basically use it to remould content? I think this sounds great!
It's possible that through sheer volume of training, the neural network essentially has a 3D engine going on, or at least picked up enough of the rules of light and shape and physics to look the same as unreal or unity
I'm not sure I feel the same way about the mammoths - and the billowing snow makes no sense as someone who grew up in a snowy area. If the snow was powder maybe but that's not what's depicted on the ground.
Main Pixar characters are all computer animated by humans. Physics effects like water, hair, clothing, smoke, and background crowds use computer physics simulation, but there are handles allowing an animator to direct the motion as per the director's wishes.
> I've quite simply never seen convincing computer-generated motion before
I’m fairly sure you have seen it many times, it was just so convincing that you didn’t realize it was CGI. It’s a fundamentally biased way to sample it, as you won’t see examples of well executed stuff.
Nah, this still has the problem with connecting surfaces that never seems to look right in any CGI. It's actually interesting that it doesn't look right here as well, considering they are completely different techniques.
Just set up a family password last week... Now it seems every member of the family will have to become their own certificate authority and carry an MFA device.
Don't think of them as "computer-generated" any more than your phone's heavily processed pictures are "computer-generated", or JWST's false color, IR-to-visible pictures are "computer-generated".
I think the implications go much further than just the image/video considerations.
This model shows a very good (albeit not perfect) understanding of the physics of objects and relationships between them. The announcement mentions this several times.
The OpenAI blog post lists "Archeologists discover a generic plastic chair in the desert, excavating and dusting it with great care." as one of the "failed" cases.
But this (and "Reflections in the window of a train traveling through the Tokyo suburbs.") seem to me to be 2 of the most important examples.
- In the Tokyo one, the model is smart enough to figure out that on a train, the reflection would be of a passenger, and the passenger has Asian traits since this is Tokyo.
- In the chair one, OpenAI says the model failed to model the physics of the object (which hints that it did try to, which is not how the early diffusion models worked; they just tried to generate "plausible" images). And we can see one of the archeologists basically chasing the chair down to grab it, which does correctly model the interaction with a floating object.
I don't think we can overestimate how crucial that is to the building of a general model that has a strong model of the world. Not just a "theory of mind", but a literal understanding of "what will happen next", independently of "what would a human say would happen next" (which is what the usual text-based models seem to do).
This is going to be much more important, IMO, than the video aspect.
Wouldn't having a good understanding of physics mean you know that a woman doesn't slide down the road when she walks? Wouldn't it know that a woolly mammoth doesn't emit profuse amounts of steam when walking on frozen snow? Wouldn't the model know that legs are solid objects that other objects cannot pass through?
Maybe I'm missing the big picture here, but the above, and all the weird spatial errors like the miniaturization of people, make me think you're wrong.
Clearly the model is an achievement and doing something interesting to produce these videos, and they are pretty cool, but understanding physics seems like quite a stretch?
I also don't really get the excitement about the girl on the train in Tokyo:
In the Tokyo one, the model is smart enough to figure out that on a train, the reflection would be of a passenger, and the passenger has Asian traits since this is Tokyo
I don't know a lot about how this model works personally, but I'm guessing that in the training data the vast majority of videos of people riding trains in Tokyo featured Asian people. Assuming this model works on statistics like all of the other models I've seen recently from OpenAI, why is it interesting that the girl in the reflection was Asian? Did you not expect that?
> Wouldn't having a good understanding of physics mean you know that a woman doesn't slide down the road when she walks? Wouldn't it know that a woolly mammoth doesn't emit profuse amounts of steam when walking on frozen snow? Wouldn't the model know that legs are solid objects that other objects cannot pass through?
This just hit me, but humans do not have a good understanding of physics either; or maybe most humans have no understanding of physics. We just observe and recognize whether something is familiar or not.
That being the case, AI will need to be way more powerful than a human mind. Maybe orders of magnitude more "neural networks" than a human brain has.
Well, we feel the world; it's pretty wild when you think about how much data the body must be receiving and processing constantly.
I was watching my child in the bath the other day; they were having the most incredible time splashing, feeling the water, throwing balls up and down. And yes, they have absolutely no knowledge of "physics", yet they navigate and interact with it as if it were the best thing they've ever done. Not even 12 months old yet.
It was all just happening on feel and yeah, I doubt they could describe how to generate a movie.
Operating a human takes an incredible intuition for physics; just because you can't write or explain the math doesn't mean your mind doesn't understand it. Further to that, we are able to apply our patterns of physics to novel external situations on the fly, sometimes within milliseconds of encountering the situation.
You only need to see a ball bounce once and your brain has done some rough approximations of its properties, and will calculate both where it's going and how to get your gangly menagerie of pivots, levers, meat servos and sockets to intercept it at just the right time.
Think also about how well people can come to understand the physics of cars and bikes in motorsport and the like. The internal model of a car's suspension in operation is non-trivial, but people can put it in their head.
Humans have an intuitive understanding of physics, not a mathy science one.
I know I can't put my hand through solid objects. I know that if I drop my laptop from chest height it will likely break it, the display will crack or shatter, the case will get a dent. If it hits my foot it will hurt. Depending on the angle it may break a bone. It may even draw blood. All of that is from my intuitive knowledge of physics. No book smarts needed.
I agree, to me the most clear example is how the rocks in the sea vanish/transform after the wave: The generated frames are hyperreal for sure, but the represented space looks as consistent as a dream.
> very good... understanding of the physics of objects and relationships between them
I am always torn here. A real physics engine has a better "understanding" but I suspect that word applies to neither Sora nor a physics engine:
https://www.wikipedia.org/wiki/Chinese_room
An understanding of physics would entail asking this generative network to invert gravity, change the density or energy output of something, or atypically reduce a coefficient of friction partway through a video. Perhaps Sora can handle these, but I suspect it is mimicking the usual world rather than understanding physics in any strong sense.
None of which is to say their accomplishment isn't impressive. Only that "understand" merits particularly careful use these days.
Question is - how much do you need to understand something in order to mimick it?
The Chinese Room, however, seems to point to some sort of prewritten if-else algorithm. Someone following scripted algorithmic procedures might not understand the content, but that simplification is obviously not what's happening with LLMs or this video generation, since algorithmic scripting requires pre-written scripts.
The Chinese Room seems to refer more to cases like "if someone tells me 'xyz', then respond with 'abc'" (where of course you don't understand what xyz or abc mean); it's not referring to neural networks training on a ton of material to build a model representation of things.
Perhaps building the representation is building understanding. But humans did that for Sora and for all the other architectures too (if you'll allow a little meta-building).
But evaluation alone is not understanding. Evaluation is merely following a rote sequence of operations, just like the physics engine or the Chinese room.
People recognize this distinction all the time when kids memorize mathematical steps in elementary school but they do not yet know which specific steps to apply for a particular problem. This kid does not yet understand because this kid guesses. Sora just happens to guess with an incredibly complicated set of steps.
I think this is a good insight. But if the kid gets sufficiently good at guessing, does it matter anymore..?
I mean, at this point the question is so vague… maybe it’s kinda silly. But I do think that there’s some point of “good-at-guessing” that makes an LLM just as valuable as humans for most things, honestly.
That matches how philosophers typically talk about the Chinese room. However the Chinese room is supposed to "behaves as if it understands Chinese" and can engage in a conversation (let us assume via text). To do this the room must "remember" previously mentioned facts, people, etc. Furthermore it must line up ambiguous references correctly (both in reading and writing).
As we now know from more than 60 years of good old fashioned AI efforts, plus recent learning based AI, this CAN be done using computers but CANNOT be done using just ordinary if - then - else type rules no matter how complicated. Searle wrote before we had any systems that could actually (behave as if they) understood language and could converse like humans, so he can be forgiven for failing to understand this.
Now that we do know how to build these systems, we can still imagine a Chinese room. The little guy in the room will still be "following pre-written scripted algorithmic procedures." He'll have archives of billions of weights for his "dictionary". He will have to translate each character he "reads" into one or more vectors of hundreds or thousands of numbers, perform billions of matrix multiplies on the results, and translate the output of the calculations -- more vectors -- into characters to reply. (We may come up with something better, but the brain can clearly do something very much like this.)
Of course this will take the guy hundreds or thousands of years from "reading" some Chinese to "writing" a reply. Realistically if we use error correcting codes to handle his inevitable mistakes that will increase the time greatly.
Implication: Once we expand our image of the Chinese room enough to actually fulfill Searle's requirements, I can no longer imagine the actual system concretely, and I'm not convinced that the ROOM ITSELF "doesn't have a mind" that somehow emerges from the interaction of all these vectors and weights.
Too bad Searle is dead, I'd love to have his reply to this.
I found the one about the people in Lagos pretty funny. The camera does about a 360° spin in total; in the beginning there are markets, then suddenly there are skyscrapers in the background. So there's only very limited object permanence.
> A beautiful homemade video showing the people of Lagos, Nigeria in the year 2056. Shot with a mobile phone camera.
The thing is -- over time I'm not sure people will care. People will adapt to these kinds of strange things and normalize them -- as long as they are compelling visually. The thing about that scene is it looks weird only if you think about it. Otherwise it seems like the sort of pan you would see in some 30 second commercial for coffee or something.
If anything it tells a story: going from market, to people talking as friends, to the giant world (of Lagos).
My Instagram feed is full of AI people. I can tell with pretty good accuracy whether an image is "AI" or real: the lighting, the framing, the scene itself; just something is off.
I think a similar thing will happen here, over the next few months we'll adapt to these videos and the problems will become very obvious.
When I first looked at the videos I was quite impressed, but I looked again and saw a bunch of weird stuff going on. I think our brains are just wired to save energy, and accepting whatever we see in a video or an image as being good enough is a pretty efficient, low-risk thing.
Agreed, at first glance of the woman walking I was so focused on how well they were animating that the surreal scene went unnoticed. Once I'd stopped noticing the surreal scene, I started picking up on weird motion in the walk too.
Where I think this will get used a lot is in advertising. Short videos, lots going on, see it once and it's gone, no time to inspect. Lady laughing with salad pans to a beach scene, here's a product, buy and be as happy as salad lady.
This will be classified unconsciously as cheap and uninteresting by the brain real quick. It'll have its place in the tides of cheap content, but if overall quality were overlooked that easily, producers would never have increased production budgets that much, ever, just for the sake of it.
In the video of the girl walking down the Tokyo city street, she's wearing a leather jacket. After the closeup on her face they pull back and the leather jacket has hilariously large lapels that weren't there before.
Object permanence (just from images/video) seems like a particularly hard problem for a super-smart prediction engine. Is it the old thing, or a new thing?
There are also perspective issues: the relative sizes of the foreground (the people sitting at the café) and the background (the market) are incoherent. Same with the "snowy Tokyo with cherry blossoms" video.
I'm not sure of your point here, though. Outside of America, in Asia and Africa, these sorts of markets mixed in with skyscrapers are perfectly normal. There is nothing unusual about it.
It just computes the next frame based on the current one and what it learned before; it's a plausible continuation.
In the same way that ChatGPT struggles with math without the code interpreter, Sora won't have accurate physics without a physics engine and rendered 3D objects.
Now it's just a "what is the next frame of this 2D image" model plus some textual context.
> It just computes the next frame based on the current one and what it learned before; it's a plausible continuation.
...
> Now it's just a "what is the next frame of this 2D image" model plus some textual context.
This is incorrect. Sora is not an autoregressive model like GPT, but a diffusion transformer. From the technical report[1], it is clear that it predicts the entire sequence of spatiotemporal patches at once.
> Sora currently exhibits numerous limitations as a simulator. For example, it does not accurately model the physics of many basic interactions, like glass shattering. Other interactions, like eating food, do not always yield correct changes in object states
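To make the contrast with next-frame prediction concrete, here is a toy sketch of joint denoising over spatiotemporal patches. Everything here (sizes, layers, the update rule) is invented for illustration; OpenAI hasn't published Sora's architecture in this detail:

    import torch
    import torch.nn as nn

    # Toy diffusion-transformer loop: ALL spatiotemporal patches of a clip
    # are denoised together, instead of predicting frame t+1 from frame t.
    n_patches, dim = 1024, 512  # invented sizes: latent patches for a whole clip
    denoiser = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True),
        num_layers=4,
    )

    x = torch.randn(1, n_patches, dim)  # start from pure noise over the full clip
    for step in range(50):              # iterative refinement, whole sequence at once
        eps = denoiser(x)               # predicted noise for every patch simultaneously
        x = x - 0.1 * eps               # toy update; real samplers follow a noise schedule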
Regardless of whether all the frames are generated at once or one by one, you can see in their examples that it's still pixel-based. See the first example with the dog in a blue hat: the woman has a blue thing suddenly spawn into her hand because her hand passed over another blue area of the image.
I'm not denying that there are obvious limitations. However, attributing them to being "pixel-based" seems misguided. First off, the model acts in latent space, not directly on pixels. Secondly, there is no fundamental limitation here. The model has already acquired limited-yet-impressive ability to understand movement, texture, social behavior, etc., just from watching videos.
I learned to understand reality by interpreting photons and various sensory inputs. Does that make my model of reality fundamentally flawed? In the sense that I only have a partial intuitive understanding of it, yes. But I don't need to know Maxwell's equations to get a sense of what happens when I open the blinds or turn on my phone.
I think many of the limitations we are seeing here - poor glass physics, flawed object permanence - will be overcome given enough training data and compute.
We will most likely need to incorporate exploration, but we can get really far with astute observation.
Actually your comment gives me hope that we will never have an AI singularity, since how the brain works is flawed, and we're trying to copy it.
Heck, a super AI might not even be possible; what if we're peak intelligence, with our millions of years of evolution?
Just adding compute speed will not help much. Say the goal of an intelligence is to win a war: if you're tasked with it then it doesn't matter if you have a month or a decade (assume that time is frozen while you do your research); it's too complex a problem and simply cannot be solved, and the same goes for an AI.
Or it will be like with chess solvers: machines will be more intelligent than us simply because they can load much more context into their "working memory" than we can.
> Actually your comment gives me hope that we will never have an AI singularity, since how the brain works is flawed, and we're trying to copy it.
As someone working in the field, the vast majority of AI research isn't concerned with copying the brain, simply with building solutions that work better than what came before. Biomimetism is actually quite limited in practice.
The idea of observing the world in motion in order to internalize some of its properties is a very general one. There are countless ways to concretize it; child development is but one of them.
> If you're tasked with it then it doesn't matter if you have a month or a decade (assume that time is frozen while you do your research); it's too complex a problem and simply cannot be solved, and the same goes for an AI.
I highly disagree.
Let's assume a superintelligent AI can break down a problem into subproblems recursively, find patterns and loopholes in absurd amounts of data, run simulations of the potential consequences of its actions while estimating the likelihood of various scenarios, and do so much faster than humans ever could.
To take your example of winning a war, the task is clearly not unsolvable. In some capacity, military commanders are tasked with it on a regular basis (with varying degrees of success).
With the capabilities described above, why couldn't the AI find and exploit weaknesses in the enemy's key infrastructure (digital and real-world) and people? Why couldn't it strategically sow dissent, confuse, corrupt, and efficiently acquire intelligence to update its model of the situation minute-by-minute?
I don't think it's reasonable to think of a would-be superintelligence as an oracle that gives you perfect solutions. It will still be bound by the constraints of reality, but it might be able to work within them with incredible efficiency.
This is an excellent comparison and I agree with you.
Unfortunately we are flawed. We do know intuitively how physics works and can somewhat predict it, but not perfectly. We can imagine how a ball will move, but the image is blurry and the trajectory only partially correct. This is why we invented math and physics studies, to be able to accurately calculate, predict and reproduce those events.
We are far off from creating something as efficient as the human brain. It will take insane amounts of compute power simply to match our basic, inaccurate brains; imagine how much will be needed to create something that is factually accurate.
Indeed. But a point that is often omitted from comparisons with organic brains is how much "compute equivalent" we spent through evolution. The brain is not a blank slate; it has clear prior structure that is genetically encoded. You can see this as a form of pretraining through a RL process wherein reward ~= surviving and procreating. If you see things this way, data-efficiency comparisons are more appropriate in the context of learning a new task or piece of information, and foundation models tend to do this quite well.
Additionally, most of the energy cost comes from pretraining, but once we have the resulting weights, downstream fine-tuning or inference are comparatively quite cheap. So even if the energy cost is high, it may be worth it if we get powerful generalist models that we can specialize in many different ways.
> This is why we invented math and physics studies, to be able to accurately calculate, predict and reproduce those events.
We won't do away with those, but an intuitive understanding of the world can go a long way towards knowing when and how to use precise quantitative methods.
It absolutely struggles with math. It's not solving anything. It sometimes gets the answer right only because it's seen the question before. It's rote memorization at best.
Just tried it in ChatGPT-4. It gives the correct output (5), along with a short explanation of the order of operations (which you probably need to know if you're asking the question).
Correct based upon whom? If someone of authority asks the question and receives a detailed response back that is plausible but not necessarily correct, and that version of authority says the answer is actually three, how would you disagree?
In order to combat Authority you need to appeal to a higher authority, and that has been lost. One follows AI. Another follows Old Men from long ago whose words populated the AI.
We shouldn't necessarily regard 5 as the correct output. Sure, almost all of us choose to make division higher precedence than addition, but there's no reason that has to be the case. I think a truly intelligent system would reply with 5 (which follows the usual convention, and would therefore mimic the standard human response), but immediately ask if perhaps you had intended a different order of operations (or even other meanings for the symbols), and suggest other possibilities and mention the fact that your question could be considered not well-defined...which is basically what it did.
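(To make that concrete: the thread doesn't show the exact expression, but something like 3 + 4 / 2 illustrates how the "correct" answer depends entirely on which convention you adopt.)

    # Standard precedence: division binds tighter than addition.
    standard = 3 + 4 / 2         # 3 + 2.0 = 5.0

    # A strict left-to-right reading is just as self-consistent.
    left_to_right = (3 + 4) / 2  # 3.5
    print(standard, left_to_right)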
I guess you might think 'math' means arithmetic. It definitely does struggle with mathematical reasoning, and I can tell you that because I and many others have tried it.
Mind you, it's not brilliant at arithmetic either...
> In the Tokyo one, the model is smart enough to figure out that on a train, the reflection would be of a passenger, and the passenger has Asian traits since this is Tokyo.
How is this any more accurate than saying that the model has mostly seen Asian people in footage of Tokyo, and thus it is most likely to generate Asian-features for a video labelled "Tokyo"? Similarly, how many videos looking out a train window do you think it's seen where there was not a reflection of a person in the window when it's dark?
I'm hoping to see progress towards consistent characters, objects, scenes etc. So much of what I'd want to do creatively hinges on needing persisting characters who don't change appearance/clothing/accessories from usage to usage. Or creating a "set" for a scene to take place in repeatedly.
I know with Stable Diffusion there are things like LoRA and ControlNet, but they are clunky (a sketch of the LoRA workflow follows below). We still seem to have a long way to go towards scene and story composition.
Once we do, it will be a game changer for redefining how we think about things like movies and television when you can effectively have them created on demand.
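For reference, the LoRA workflow mentioned above looks roughly like this with the diffusers library; the checkpoint is a common public one, and the LoRA path and trigger token are hypothetical, standing in for a character model you'd train yourself:

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # Hypothetical LoRA trained on one character; its trigger token
    # ("sks-character" here) is what keeps the look consistent.
    pipe.load_lora_weights("./my_character_lora")

    image = pipe("photo of sks-character walking through a market").images[0]
    image.save("scene_01.png")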
Let's not get ahead of ourselves. Those are specifically crafted, hand-picked videos, where the only requirement was "write a generic prompt and pick something that looks good". Which is very different from the actual process, where you have a very specific idea and want the machine to make it happen.
DALL-E presentation also looked cool and everyone was stoked about it. Now that we know of its limitations and oddities? YMMV, but I'd say not so much - Stable Diffusion is still the go-to solution. I strongly suspect the same thing with Sora.
The examples are most certainly cherry-picked. But the problem is there are 50 of them. And even if you gave me 24-hour full access to SVD1.1/Pika/Runway (anything out there that I can use), I won't be able to get 5 examples that match these in quality (temporal consistency/motion/prompt following) and, more importantly, in length. Maybe I am overly optimistic, but this seems too good.
Credit to OpenAI for including some videos with failures (extra limbs, etc.). I also wonder how closely any of these videos might match one from the training set. Maybe they chose prompts that lined up pretty closely with a few videos that were already in there.
Lack of quality in the details, yes, but the fact that characters and scenes show consistent, real movement and evolution, as opposed to the cinemagraph and frame-morphing stuff we have had so far, is still remarkable!
That particular example seems to have more of a "cheap 3D" style to it, but the actual synthesis seems on par with the other examples. If the prompt had specified a different style, it'd have that style instead. This kind of generation isn't like actual animating: a "cheap 3D" style and a "realistic cinematic" style take roughly the same amount of work to look right.
Sarah was a video sorter; this was her life. She graduated top of her class in film, and all she could find was the monotonous job of selecting videos that looked just real enough.
Until one day, she couldn't believe it. It was her. A video of her, in that very moment, sorting. She went to pause the video, but stopped when her doppelganger did the same.
> Stable Diffusion is still the go-to solution. I strongly suspect the same thing with Sora.
Sure, for people who want detailed control with AI-generated video, workflows built around SD + AnimateDiff, Stable Video Diffusion, MotionDiff, etc., are still going to beat Sora for the immediate future, and OpenAI's approach structurally isn't as friendly to developing a broad ecosystem adding power on top of the base models.
OTOH, the basic prompt-to-video capacity of Sora is now good enough for some uses, and where detailed control is not essential, that space is going to keep expanding. One question is how much their plans for safety checking (which they state will apply both to the prompt and to every frame of output) will cripple this versus alternatives, and how much the regulatory environment will or won't make it possible to compete with that.
> I suspect given equal effort into prompting both, Sora probably provides superior results
Strictly to prompting, probably, just as that is the case with Dall-E 3 vs, say, SDXL.
The thing is, there’s a lot more that you can do than just tweaking prompting with open models, compared to hosted models that offer limited interaction options.
In the past the examples tweeted by OpenAI have been fairly representative of the actual capabilities of the model. i.e. maybe they do two or three generations and pick the best, but they aren't spending a huge amount of effort cherry-picking.
While Sora might be able to generate short 60-90 second videos, how well it would scale with a larger prompt or a longer video remains yet to be seen.
And the general logic of having the model do 90% of the work for you and then editing what is required might be harder with videos.
Most fictional long-form video (whether live-action movies or cartoons, etc) is composed of many shots, most of them much shorter than 7 seconds, let alone 60.
I think the main factor that will be key to generate a whole movie is being able to pass some reference images of the characters/places/objects so they remain congruent between two generations.
You could already write a whole book in GPT-3 from running a series of one-short-chapter-at-a-time generations and passing the summary/outline of what's happened so far. (I know I did, in a time that feels like ages ago but was just early last year)
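A minimal sketch of that loop with the current OpenAI client (the commenter used GPT-3; the model name and prompts here are placeholders):

    from openai import OpenAI

    client = OpenAI()
    outline = "A heist story set on a generation ship."
    chapters = []

    for i in range(1, 11):
        # Each call gets the running outline instead of the full text so far,
        # keeping the prompt within the context window.
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": "You write a novel one short chapter at a time."},
                {"role": "user", "content": f"Outline so far:\n{outline}\n\nWrite chapter {i}."},
            ],
        )
        chapter = resp.choices[0].message.content
        chapters.append(chapter)

        # Fold the new chapter back into the outline for the next call.
        summary = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": f"Summarize in two sentences:\n{chapter}"}],
        )
        outline += "\n" + summary.choices[0].message.content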
> I think the main factor that will be key to generate a whole movie is being able to pass some reference images of the characters/places/objects so they remain congruent between two generations.
I partly agree with this. The congruency however needs to extend to more than 2 generations. If a single scene is composed of multiple shots, then those multiple shots need to be part of the same world the scene is being shot in.
If you check the video with the title `A beautiful homemade video showing the people of Lagos, Nigeria in the year 2056. Shot with a mobile phone camera.`, the surroundings do not seem to make sense: the view starts with a market, spirals around a point, and then ends with a bridge that does not fit with the market.
Even if the different shots the model generated didn't fit together seamlessly, making them fit together is where the difficulty would come in. However, I don't have any experience in video editing, so this is just speculation.
The CGI industry is about to be turned upside down. They charge hundreds of thousands per minute, and it takes them forever to produce the finished product.
I'm almost speechless. I've been keeping an eye on the text-to-video models, and if these example videos are truly indicative of the model, this is an order of magnitude better than anything currently available.
In particular, looking at the video titled "Borneo wildlife on the Kinabatangan River" (number 7 in the third group), the accurate parallax of the tree stood out to me. I'm so curious to learn how this is working.
holy cow, is that the future of gaming? instead of 3D renders it's real-time video generation, complete with audio and music and dialog and intelligent AI conversations and it's a unique experience no one else has ever played. gameplay mechanics could even change on the fly
DLSS is essentially this, isn't it? It uses a low-quality render from the game and then increases the fidelity with a neural network (an upscaler rather than a diffusion model, but similar in spirit).
Yeah, but I mean, who knows why. I know some people can't; my GF is one of them.
I've often wondered if I'm OK with it because I'm used to the object-on-head stuff (like 25-odd years of motorcycle riding and helmet wearing) and close-up, high-FOV, fast-paced gaming. (I play on a 32", maybe 70 cm from the eyes, give or take.)
> I am prone to sea sickness. Maybe it is related.
I'd think it might be, given my understanding of why the sickness is triggered in many people. It's odd because I never got sick from it, but I've seen others get INCREDIBLY ill in two different ways:
1. My GF tried to use simple locomotion in a game and almost vomited as an immediate reaction
2. A friend who was fine at first, but then randomly started getting ill, very slowly, over a matter of like an hour, just getting more and more nauseous after the fact.
It's unfortunate, because due to my lack of bad feelings/nausea/discomfort etc., I love VR. But equally, from those around me, I can see no real path forward for it as it stands today because of those impacts and limitations.
That being said, maybe they get smaller, lighter, we learn to induce motion sickness less, I dunno. I'm not optimistic.
Even otherwise, and no matter how good the screen and speakers are, a screen and speakers can only be so immersive. People oversell the potential for VR when they describe it as being as good as or better than reality. Nothing less than the Matrix is going to work in that regard.
Yep, once your brain gets over the immediate novelty of VR, it’s very difficult to get back that “Ready Player One” feeling due to the absence of sensory feedback.
If/once they get it working though, society will shift fast.
There’s an XR app called Brink Traveler that’s full of handcrafted photogrammetry recreations of scenic landmarks. On especially gloomy PNW winter days, I’ll lug a heat lamp to my kitchen and let it warm up the tiled stone a bit, put a floor fan on random oscillation, toss on some good headphones, load up a sunny desert location in VR, and just lounge on the warm stone floor for an hour.
My conscious brain “knows” this isn’t real and just visuals alone can’t fool it anymore, but after about 15 minutes of visuals + sensory input matching, it stops caring entirely. I’ve caught myself reflexively squinting at the virtual sun even though my headset doesn’t have HDR.
For games like 2D/3D fighting games, where you don't need to generate a lot of terrain, the possibility of randomly generating stages with unique terrain and obstacles is interesting.
The diffusion is almost certainly taking place over some sort of compressed latent, from the visual quirks of the output I suspect that the process of turning that latent into images goes latent -> nerf / splat -> image, not latent -> convolutional decoder -> image
Agreed. It's amazing how much of a head start OpenAI appears to have over everyone else. Even Microsoft who has access to everything OpenAI is doing. Only Microsoft could be given the keys to the kingdom and still not figure out how to open any doors with them.
Microsoft doesn’t have access to OpenAI’s research, this was part of the deal. They only have access to the weights and inference code of production models and even then who has access to that inside MS is extremely gated and only a few employees have access to this based on absolute need to actually run the service.
AI researchers at MSFT barely have more insight into OpenAI than you do reading HN.
No, they have early access. For example, MSFT was using DALL-E Exp (an early version of DALL-E 3) in PUBLIC since February of 2023.
In the same month, they were also using GPT-4 in public, before OpenAI.
And they had access to GPT-4 in 2022 (which was when they decided to create Bing Chat, now called Copilot).
All the current GPT-4 models at MSFT are also finetuned versions (literally, the Creative and Precise modes run different finetuned versions of GPT-4). It has run finetuned versions since launch, even...
Microsoft said that they could continue OpenAI's research with no slowdown if OpenAI cut them off by hiring all OpenAI's people, so from that statement it sounds like they have access.
Except they keep trying to shove AI into everything they own. CoPilot Studio is an example of how laughably bad at it they are. I honestly don't understand why they don't contract out to OpenAI to help them do some of these integrations.
Every company is trying to shove AI into everything they own. It's what investors currently demand.
OpenAI is likely limited by how fast they are able to scale their hiring. They had 778 FTEs when all the board drama occurred, up 100% YoY. Microsoft has 221,000. It seems difficult to delegate enough headcount to all the exploratory projects of MSFT and it's hard to scale headcount quicker while preserving some semblance of culture.
The only official statement from Microsoft is "While details of our agreement remain confidential, it is important to note that Microsoft does not own any portion of OpenAI and is simply entitled to share of profit distributions," said company spokesman Frank Shaw.
I suspect it's less about being puritanical about violence and nudity in and of themself, and more a blanket ban to make up for the inability to prevent the generation of actually controversial material (nude images of pop stars, violence against politicians, hate speech)
Put like that, it's a bit like the Chumra in Judaism [1]. The fence, or moat, around the law that extends even further than the law itself, to prevent you from accidentally committing a sin.
I am guessing a movie studio will get different access with controls dropped. Of course, that does mean they need to be VERY careful when editing, and making sure not to release a vagina that appears for 1 or 2 frames when a woman is picking up a cat in some random scene.
We can't do narrative sequences with persistent characters and settings, even with static images.
These video clips are just generic stock clips. You can cut them together to make a sequence of random flashy whatever, but you still can't do storytelling in any conventional sense. We don't appear to be close to being able to use these tools for the hypothetical disruptive use case we worry about.
Nonetheless, the stock video and photo people are in trouble. So long as the details don't matter, this stuff is presumably useful.
I wonder how much of it is really "concern for the children" type stuff vs. not wanting to deal with fights over what should be allowed, how, and for whom. When film was new, towns and states started to create censorship review boards. When mature content became viewable on the web, battles (still ongoing) began over how much you need to do to prevent minors from accessing it. Now useful AI-generated content is the new thing, and you can avoid this kind of distraction by going this route instead.
I'm not supporting it in any way, I think you should be able to generate and distribute any legal content with the tools, but just giving a possible motive for OpenAI being so conservative whenever it comes to ethics and what they are making.
I've been watching 80s movies recently, and the amount of nudity and sex scenes often feels unnecessary. I'm definitely not a prude. I watch porn, I talk about sex with friends, I go to kinky parties sometimes. But it really feels that a lot of movies sacrificed stories to increase sex appeal — and now that people have free and unlimited access to porn, movies can finally be movies.
Where is the training material for this coming from? The only resource I can think of that's broad enough for a general purpose video model is YouTube, but I can't imagine Google would allow a third party to scrape all of YT without putting up a fight.
You can still have a broad dataset and use RLHF to steer it more towards an aesthetic, like Midjourney and SDXL did through Discord feedback. I think there was still some aesthetic selection in the dataset as well, but it still included a lot of crap.
The big standout to me, beyond almost any other text-to-video solution, is that the video duration is tremendously longer (a minute plus). Everything else that I've seen can't get beyond 15 to 20 seconds at the absolute maximum.
In terms of following the prompt and generating visually interesting results, I think they're comparable. But the resolution for Sora seems so far ahead.
Worth noting that Google also has Phenaki [0] and VideoPoet [1] and Imagen Video [2]
I know it's Runway (and has all manner of those dream-like AI artifacts), but I like what this person is doing with just a bunch of 4-second clips and an awesome soundtrack:
The Hollywood Reporter says many in the industry are very scared.[1]
“I’ve heard a lot of people say they’re leaving film,” he says. “I’ve been thinking of where I can pivot to if I can’t make a living out of this anymore.” - a concept artist responsible for the look of the Hunger Games and some other films.
"A study surveying 300 leaders across Hollywood, issued in January, reported that three-fourths of respondents indicated that AI tools supported the elimination, reduction or consolidation of jobs at their companies. Over the next three years, it estimates that nearly 204,000 positions will be adversely affected."
"Commercial production may be among the main casualties of AI video tools as quality is considered less important than in film and TV production."
Honest question: of what possible use could Sora be for Hollywood?
The results are amazing, but if the current crop of text-to-image tools is any guide, it will be easy to create things that look cool but essentially impossible to create something that meets detailed specific criteria. If you want your actor to look and behave consistently across multiple episodes of a series, if you want it to precisely follow a detailed script, if you want continuity, if you want characters and objects to exhibit consistent behavior over the long term – I don't see how Sora can do anything for you, and I wouldn't expect that to change for at least a few years.
(I am entirely open to the idea that other generative AI tools could have an impact on Hollywood. The linked Hollywood Reporter article states that "Visual effects and other postproduction work stands particularly vulnerable". I don't know much about that, I can easily believe it would be true, but I don't think they're talking about text-to-video tools like Sora.)
I suspect that one of the first applications will be pre-viz. Before a big-budget movie is made, a cheap version is often made first. This is called "pre-visualization". These text to video applications will be ideal for that. Someone will take each scene in the script, write a big prompt describing the scene, and follow it with the dialog, maybe with some commands for camerawork and cuts. Instant movie. Not a very good one, but something you can show to the people who green-light things.
There are lots of pre-viz reels on line. The ones for sequels are often quite good, because the CGI character models from the previous movies are available for re-use.
Unreal Engine is often used.
Given that you can already do this with still images on a normal M-series MacBook _today_, automating it would be pretty trivial.
Just feed it a script and get a bunch of pre-vis images for every scene.
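A rough sketch of that automation, assuming an Apple-silicon Mac (PyTorch's "mps" backend), the diffusers library, and a hard-coded scene list standing in for a parsed screenplay:

    from diffusers import StableDiffusionPipeline

    # Placeholder checkpoint; any local image model would do.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5"
    ).to("mps")

    # In practice you'd parse the script into scenes; hard-coded here.
    scenes = [
        "EXT. DESERT - DAY. Archeologists excavate a chair half-buried in sand.",
        "INT. TRAIN - NIGHT. A passenger's reflection in the window, Tokyo suburbs.",
    ]
    for i, scene in enumerate(scenes):
        pipe(scene).images[0].save(f"previz_scene_{i:02d}.png")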
When we get something like this running on hardware with an uncensored model, there's going to be a lot of redundancies but also a ton of new art that would've never happened otherwise.
It wouldn't be too hard to do any of the things you mention. See ControlNet for Stable Diffusion, and vid2vid (if this model does txt2vid, it can also do vid2vid very easily).
So you can just record some guiding footage, similar to motion capture but with any regular phone camera, and morph it into anything you want (see the sketch below). You don't even need the camera, of course; a simple 3D animation without textures or lighting would suffice.
Also, consistent look has been solved very early on, once we had free models like Stable Diffusion.
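As a concrete sketch of that guiding workflow, assuming the diffusers library: a pose map extracted from phone footage (or rendered from an untextured 3D animation) drives the generated character one frame at a time. The checkpoints are common public ones; the input file is hypothetical:

    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
    from diffusers.utils import load_image

    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
        torch_dtype=torch.float16,
    ).to("cuda")

    # Hypothetical pose map for one frame of the guiding footage.
    pose = load_image("frame_0001_pose.png")
    out = pipe("a knight in ornate armor, cinematic lighting", image=pose).images[0]
    out.save("frame_0001_styled.png")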
Right now you'd need a mixed artistic/ML team; you wouldn't use an off-the-shelf tool. There was a video of some guys doing this (sorry, can't find it) to make an anime-type animation with consistent characters.
They used videos of themselves run through their own models to make the characters. So I reckon that while prompt -> blockbuster is not here yet, a movie made mostly with AI is possible; it will cost a lot now, but that cost will go down. While this is sad, it is also exciting. And scary. Black Mirror-like, we will start creating AIs we have relationships with, and bring people back to life (!) from history; maybe grieving people will do this. Not sure that's healthy, but people will do it once it's a click-of-a-button thing.
> There was a video of some guys doing this (sorry, can't find it) to make an anime-type animation with consistent characters. They used videos of themselves run through their own models to make the characters.
It won't be Hollywood at first. It will be small social ads for TikTok, IG and social media. The brands likely won't even care if they don't get copyright at the end, since they have copyright of their product.
Seconding this. There is also a huge SMB and commercial business that supports many agencies and production companies. This could replace a lot of that work.
The OpenAI announcement mentions being able to provide an image to start the video generation process from. That sounds to me like it will actually be incredibly easy to anchor the video generation to some consistent visual - unlike all the text-based stable diffusion so far. (Yes, there is img2img, but that is not crossing the boundary into a different medium like Sora).
I don't see why -- the distance between "here's something that looks almost like a photo, moving only a little bit like a mannequin" and "here's something that has the subtle facial expressions and voice to convey complex emotions" is pretty freaking huge; to the point where the vast majority of actual humans fail to be that good at it. At any rate, the number of BNNs (biological neural networks) competing with actors has only been growing, with 8 billion and counting.
> Amazing time to be a wannabe director or producer or similar creative visionary. Amazing time to be an aspirant that would otherwise not have access to resources, capital, tools in order to bring their ideas to fruition.
Perhaps if you mainly want to do things for your own edification. If you want to be able to make a living off it, you're suddenly going to be in a very, very flooded market.
It’s for sure plausible that acting remains a viable profession.
The bull case would be something like the "ractives" in "The Diamond Age" by Neal Stephenson: instead of video games, people play at something like live plays with real human actors. In this world there is orders of magnitude more demand for acting.
Personally I think it's more likely that we see AI cross the uncanny valley in a decade or two (at least for movies/TV/TikTok-style content). But this is nothing more than a hunch; say 55/45 confidence.
> Perhaps if you mainly want to do things for your own edification.
My mental model is that most aspiring creatives fall in this category. You have to be doing quite well as an actor to make a living from it, and most who try do not.
> the distance between "here's something that looks almost like a photo, moving only a little bit like a mannequin" and "here's something that has the subtle facial expressions and voice to convey complex emotions" is pretty freaking huge;
The distance between pixelated noise and a single image is freaking huge.
The distance between a single image and a video of a consistent 3D world is freaking huge (albeit with rotating legs).
The distance between a video of a consistent 3D world and a full length movie of a consistent 3D world with subtle facial expressions is freaking huge.
So... next 12 months then.
>If you want to be able to make a living off it, you're suddenly going to be in a very, very flooded market.
Considering that a year ago we had that nightmare fuel of Will Smith eating spaghetti, and Don and Joe on Hair Force One, it seems odd to see those of you who assume we're not going to reach the point of being indistinguishable from reality in the near future.
We might enter a world where "actors" are just for mocap. They do the little micro expressions with a bunch of dots on their face.
AI models add the actual character and maybe even voice.
At that point the number of actors we "need" will go down drastically. The same experienced group of a dozen actors could do multiple movies a month if needed.
It's always a bad time to be an actor, between long hours, low pay, and a culture of abuse, but this will definitely make it worse. My writer and artist friends are already despondent from genAI -- it was rare to be able to make art full-time, and even the full-timers were barely making enough money to live. Even people writing and drawing for marketing were not exactly getting rich.
I think this will lead to a further hollowing-out of who can afford to be an actor or artist, and we will miss their creativity and perspective in ways we won't even realize. Similarly, so much art benefits from being a group endeavor instead of someone's solo project -- imagine if George Lucas had created Star Wars entirely on his own.
Even the newly empowered creators will have to fight to be noticed amid a deluge of carelessly generated spam and sludge. It will be like those weird YouTube Kids videos, but everywhere (or at least like indie and mobile games are now). I think the effect will be that many people turn to big brands known for quality, many people don't care that much, and there will be a massive doughnut hole in between.
> Even the newly empowered creators will have to fight to be noticed amid a deluge of carelessly generated spam and sludge. It will be like those weird YouTube Kids videos, but everywhere (or at least like indie and mobile games are now).
Reminds me of Syndrome's quote in the Incredibles.
I dunno. Thanks to big corpo shenanigans (and, er, racism?) a lot of people have turned away from big brands (or, at least obviously brand-y brands) towards "trusted individuals" (though you might classify them as brands themselves). Who goes to PCMag anymore? It's all LTT and Marques Brownlee and any number of small creators. Or, the people on the right who abandoned broadcast and even cable news and get everything they "know" from Twitter randos. Even on this site, asks for a Google Search alternative are not rare, and you'll get about a dozen different answers each time, each with a fraction of the market share of the big guy (but growing).
I'm thinking people will probably still want to see their favorite actors, so established actors may sell the rights to their image. They're sitting on a lot of capital. Bad time to be becoming an actor though.
Even the average SAG-AFTRA member barely makes a living wage from acting.
And those are the ones that got into the union. There's a whole tier below that.
If you spend time in LA, you probably know some actress/model/waitress types.
There's also the weird misery of being famous, but not rich. You can't eat fame.
Likely less and less, though, given that people will be able to generate a hyper-personalized set of actors/characters/personalities in their hyper-personalized generated media.
Younger generations growing up with hyper personalized media will likely care even less about irl media figures.
You can’t replace actors with this for a long time. Actors are “rendering” faster than any AI. Animation is where the real issues will show up first, particularly in Advertising.
Have you seen the amount of CGI in movies and TV shows? :)
In many AAA blockbusters the "actors" on screen are just CGI recreations during action scenes.
But you're right, actors won't be out of a job soon, but unless something drastic happens they'll have the role of Vinyl records in the future. For people who appreciate the "authenticity". =)
I think you can fill in many scenes for the actor, perhaps with a dupe that would look like the real actor. Of course the original actor would have to be paid, but perhaps much less, as the effort is reduced.
If it requires acting, it likely can't be done with AI. You underestimate, I think, how much an actor carries a movie. You can use it for digi-doubles maybe, for stunts and VFX. But if his face is on the screen... We are ages away from having an AI actor perform at the same level as Daniel Day-Lewis, Willem Dafoe, or anyone else in that atmosphere. They make too many interesting choices per second for it to be replaced by AI.
Quality aside, there's a reason producers pay millions for A-list stars instead of any of the millions of really good aspiring actors in LA that they could hire for pennies. People will pay to see the new Matt Damon flick but wouldn't give it a second glance if some no-name was playing the part.
If you can't replace Matt Damon with another equivalently skilled human, CGI won't be any different.
Granted, maybe that's less true today, given that Marvel and such are more about the action than the acting. But if that's the future of the industry anyway, then acting as a worthwhile profession is already on its way out, CGI or no.
Yes, people also take actors as a sign of the quality of the film, or at least they used to, before Marvel. Hence films with big names attached get more money, etc.
Still, the idea that actors are easy to replace is preposterous to anyone who's ever worked with actors. They are preposterously HARD to replace, in theatre and film. A good actor is worth their weight in gold. Very, very few people are good actors. A good actor is a good comedian, a master at controlling his body, and a master at controlling his voice, towards a specifically intended goal. They can make you laugh, cry, sigh, or feel just about anything. Just look at Paul Giamatti or Willem Dafoe or Denzel Washington. Those people are not replaceable, and their work is just as good and just as culturally important as a Picasso or a Monet. A hundred years from now people will know the names of actors, because that was the dominant mode of entertainment of our age.
The idea that this destroys the industry is overblown, because the film industry has already been dying since the 2000s.
Hollywood is already destroyed. It is not the powerful entity it once was.
In terms of attention and time spent on entertainment, YouTube has already surpassed them.
This will create a multitude more YouTube creators who do not care about getting this right or making a living out of it. It will take our attention all the same, away from traditional Hollywood.
Yes, there will still be great films and franchises, but the industry is shrinking.
This is similar to journalism saying that AI will destroy it. Well, there was nothing left to destroy, because a bunch of traditional newspapers had already closed shop before AI came along.
They shouldn’t be worried so soon. This will be used to pump out shitty hero movies more quickly, but there will always be demand for a masterpiece after the hype cools down.
This is like a chef worrying going out of business because of fast food.
Without a change in copyright law, I doubt it. The current policy of the USCO is that the products of AI based on prompts like this are not human-authored and can't be copyrighted. No one is going to release AI-created stuff that someone else can reproduce, because it's public domain.
Has anyone else noticed the leg swap in the Tokyo video at 0:14? I guess we are past uncanny, but I do wonder if these small artifacts will always be present in generated content.
It also raises the question: if more and more children are introduced to media from a young age and fed more and more generated content, will they still be able to feel "uncanniness", or will they become completely numb to it?
There's definitely an interesting period ahead of us; I'm not yet sure how to feel about it...
There are definitely artifacts. Go to the 9th video in the first batch, the one of the guy sitting on a cloud reading a book. Watch the book; the pages are flapping in the wind in an extremely strange way.
Yep, I noticed it immediately too. Yet it is subtle in reality.
I'm not that good at spotting imperfections in pictures, but in the video I immediately felt something was not quite right.
There have been children who react irritated when they cannot swipe away real-life objects. The idea is to give kids enough real-world experiences so that this does not happen.
I noticed at the beginning that cars are driving on the right side of the road, but in Japan they drive on the left. The AI misses little details like that.
(I'm also not sure they've ever had a couple inches of snow on the ground while the cherry blossoms are in bloom in Tokyo, but I guess it's possible.)
The cat in the "cat wakes up its owner" video has two left front legs, apparently.
There is nothing that is true in these videos. They can and do deviate from reality at any place and time and at any level of detail.
These artefacts go down with more compute. In four years when they attack it again with 100x compute and better algorithms I think it'll be virtually flawless.
I had to go back to 0:14 several times to see if it was really unusual. I get it, of course, but I probably would never have noticed it on my own, even after watching 20 times.
I don't think that's the case. I think they're aware of the limitations and problems. Several of the videos have obvious problems, if you're looking - e.g. people vanishing entirely, objects looking malformed in many frames, objects changing in size incongruent with perspective, etc.
I think they just accept it as a limitation, because it's still very technically impressive. And they hope they can smooth out those limitations.
Certainly not perfect... but "some impressive things" is an understatement. Think of how long it took to get halfway-decent CGI... this AI thing is already better than clips I've seen people spend days building by hand.
This is pretty impressive; it seems that OpenAI consistently delivers exceptional work, even when venturing into new domains. Looking at their technical paper, though, it is evident that they are benefiting from their own past body of work and from the enormous resources available to them.
For instance, the generational leap in Sora's video generation capability may be possible because:
1. Instead of resizing, cropping, or trimming videos to a standard size, Sora trains on data at its native size. This preserves the original aspect ratios and improves composition and framing in the generated videos. This requires massive infrastructure. This is eerily similar to how GPT3 benefited from a blunt approach of throwing massive resources at a problem rather than extensively optimizing the architecture, dataset, or pre-training steps.
2. Sora borrows the re-captioning technique from DALL-E 3, using GPT to turn short user prompts into longer, detailed captions that are sent to the video model (see the sketch after this comment). Although it remains unclear whether they employ GPT-4 or another internal model, it stands to reason that they have access to a superior captioning model compared to others.
This is not to say that inertia and resources are the only factors differentiating OpenAI; they may also have access to a much better talent pool, but that is hard to gauge from the outside.
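A minimal sketch of the re-captioning step from point 2, using the current OpenAI client; the model name and prompts are stand-ins, since OpenAI hasn't said exactly what they use:

    from openai import OpenAI

    client = OpenAI()
    user_prompt = "a dog in the snow"

    resp = client.chat.completions.create(
        model="gpt-4",  # stand-in; OpenAI hasn't said which captioner they use
        messages=[
            {"role": "system", "content": "Rewrite the prompt as a long, highly "
             "detailed video caption: subjects, setting, lighting, camera motion."},
            {"role": "user", "content": user_prompt},
        ],
    )
    detailed_caption = resp.choices[0].message.content  # what the video model sees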
In this video, there's extremely consistent geometry as the camera moves, but the texture of the trees/shrubs on the top of the cliff on the left seems to remain very flat, reminiscent of low-poly geometry in games.
I wonder if this is an artifact of the way videos are generated. Is the model separating scene geometry from camera? Maybe some sort of video-NeRF or Gaussian Splatting under the hood?
Curious about what current SotA is on physics-infusing generation. Anyone have paper links?
OpenAI has a few details:
>> The current model has weaknesses. It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect. For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark.
>> Similar to GPT models, Sora uses a transformer architecture, unlocking superior scaling performance.
>> We represent videos and images as collections of smaller units of data called patches, each of which is akin to a token in GPT. By unifying how we represent data, we can train diffusion transformers on a wider range of visual data than was possible before, spanning different durations, resolutions and aspect ratios.
>> Sora builds on past research in DALL·E and GPT models. It uses the recaptioning technique from DALL·E 3, which involves generating highly descriptive captions for the visual training data. As a result, the model is able to follow the user’s text instructions in the generated video more faithfully.
The implied claims - that it understands the physics of simple scenes and at least some instances of cause and effect - are impressive!
Although I assume that's been SotA-possible for a while, and I just hadn't heard?
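Tangent, but the "patches" idea quoted above is easy to picture in code. Here's a rough sketch (my own simplification with made-up patch sizes, not OpenAI's implementation) of chopping a video tensor into GPT-token-style spacetime patches:

    # Rough sketch: turn a video into a sequence of spacetime patches,
    # analogous to tokens in a language model. All sizes here are
    # arbitrary illustrative choices, not Sora's published parameters.
    import torch

    def video_to_patches(video, pt=4, ph=16, pw=16):
        # video: (frames, channels, height, width)
        f, c, h, w = video.shape
        assert f % pt == 0 and h % ph == 0 and w % pw == 0
        x = video.reshape(f // pt, pt, c, h // ph, ph, w // pw, pw)
        x = x.permute(0, 3, 5, 1, 2, 4, 6)       # patch positions first, contents last
        return x.reshape(-1, pt * c * ph * pw)   # (num_patches, patch_dim)

    clip = torch.randn(16, 3, 256, 256)          # dummy 16-frame clip
    tokens = video_to_patches(clip)
    print(tokens.shape)                          # torch.Size([1024, 3072])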
I saw similar artifacts in DALL-E 1 a lot (as if the image was pasted onto geometry). It definitely wouldn't surprise me if they used synthetic rasterized data in the training, which could totally create artifacts like this.
The model is essentially doing nothing but dreaming.
I suspect that anything that looks like familiar 3D-rendering limitations is probably a result of the training dataset simply containing a lot of actual 3D-rendered content.
We can't tell a model to dream everything except extra fingers, false perspective, and 3D-rendering compromises.
Technically we can, that's what negative prompting[1] is about. For whatever reason, OpenAI has never exposed this capability in its image models, so it remains an open source exclusive.
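For anyone unfamiliar, here's roughly what negative prompting looks like with the open-source diffusers library (Stable Diffusion standing in, since OpenAI's models don't expose it); the prompts are just illustrative:

    # Negative prompting: steer generation *away* from listed concepts.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    image = pipe(
        prompt="portrait photo of a woman walking down a city street",
        # the "dream everything except extra fingers" request from above:
        negative_prompt="extra fingers, deformed hands, warped perspective",
    ).images[0]
    image.save("portrait.png")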
It's possible it was pre-trained on 3D renderings first, because it's easy to get almost infinite synthetic data that way, and after that they continued the training on real videos.
I say this with all sincerity, if you're not overwhelmingly impressed with Sora then you haven't been involved in the field of AI generated video recently. While we understand that we're on the exponential curve of AI progress, it's always hard to intuit just what that means.
Sora represents a monumental leap forward - comically, something like a 3000% improvement in seconds of 'coherent' video generation. Coupled with a significantly enhanced understanding of contextual prompts and overall quality, it has achieved what many (most?) thought would take another year or two.
I think we will see studios like ILM pivoting to AI in the near future. There's no need for 200 VFX artists when you can have 15 artists working with AI tooling to generate all the frame-by-frame effects, backgrounds, and compositing for movies. It'll open the door for indie projects that can take place in settings that were previously the domain of big Hollywood. A sci-fi opera could be put together with a few talented actors, AI effects and a small team to handle post-production. This could conceivably include AI scoring.
Sure, Hollywood and the various guilds will strongly resist, but it'll take just a handful of streaming companies pivoting. Suddenly content creation costs for Netflix drop an order of magnitude. The economics of content creation will fundamentally change.
At the risk of being proven very wrong, I think replacing actors is still fairly distant in the future but again... humans are bad at conceptualizing exponential progress.
I strongly believe that AI will have a massive impact on the film industry; however, it won't be because of a black-box, text-to-video tool like Sora. VFX artists and studios still want a high level of control over the end product, and unless it's very simple to tweak small details - like the blur of an object in the background, or the particle physics of an explosion - they won't use it. What Hollywood needs are AI tools that can integrate with their existing workflows. I think Adobe is doing a pretty good job at this.
You're completely missing the point. Who cares what VFX artists and studios want if anyone with a small team can create high quality entertaining videos that millions of people would pay to watch? And if you think that's a bar too high for AI, then you haven't actually seen the quality of average videos and films generated these days.
I was specifically responding to this point which seemed to be the thesis of the parent commenter.
> I think we will see studios like ILM pivoting to AI in the near future. There's no need for 200 VFX artists when you can have 15 artists working with AI tooling
Yes, this will bring the barrier to entry for small teams down significantly. However, it's not going to replace the 200 people at studios like ILM.
I believe this to be a failure of imagination. You're assuming Sora stays like this. The reality is we are on an exponential, and it's just a matter of time. ILM will be the last to go, but it'll eventually go, in the sense of needing fewer humans to create the same output.
I think it's fair to be impressed with Sora as the next stage of AI video, yet not be too surprised or consider it some insurmountable leap from the public pieces we've seen of AI video up to this point. We've always been just a couple papers away, seeking a good consistency modelling step - now we've got it. Amazing and viscerally chilling - seeing the net effect - but let's not be intimidated so easily or prop these guys up as gods just for being a bit ahead of the obviously-accelerating curve. Anyone tracking this stuff had a very strong prediction of good AI video within a year - two max. This was a context size increase and overall impressive quality pass reaching a new milestone, but the bones were there.
Do you feel the same way about modern movies? CGI is so ubiquitous and accessible, that most movies use some form of it. It's actually news when a filmmaker _doesn't_ use CGI (e.g. Nolan).
These advancements are just the next step in that evolution. The tech used in movies will be commoditized, and you'll see Hollywood-style production in YouTube videos.
I'm not sure why you think theater will become _more_ popular because of this. It has remained popular throughout the years, as technology comes and goes. People can enjoy both video and theater, no?
I agree, seeing real human actors on stage will always be popular for some consumers. Same for local live musicians.
That said, I helped a friend who makes low budget, edgy and cool films last week. I showed him what I knew about driving Pika.art and he picked it up quickly. He is very excited about the possibility of being able to write more stories and turn them into films.
I think there is plenty of demand for all kinds of entertainment. It is sad that so many creative people in Hollywood and other content creation centers will lose jobs. I think the very best people will be employed, but often partnered with AIs. Off topic, but I have been a paid AI practitioner since 1982, and the breakthroughs of deep learning, transformers, and LLMs are stunning.
I actually suspect one of the new most popular mediums will be actors on a theatre stage doing live performances to a live AI CGI video being rendered behind them - similar to musicians in a live orchestra. It would bring together the nostalgia and wonder of human acting and performance art, while still smoothing and enhancing their live performance into the quality levels and wonder we've come to expect from movie theatre experiences. This will be technologically doable soon.
No it's not. Imagine turning on the television when you get home and it's a show all about you (think Breaking Bad, but you're Walter White). You flip to another channel and it's a pornographic movie where you sleep with all the world's most famous movie stars. Flip the channel again and it's all the home movies you wish you had but were never able to make.
This is a future we could once only dream of, and OpenAI is making it possible. Has anyone noticed how anti-progress HN has become lately?
I guess it depends on your definition of progress. None of those examples you listed sound particularly appealing to me. I've never watched a show and thought I'd get more enjoyment if I was at the center of that story. Porn and dating apps have created such unrealistic expectations of sex and relationships that we're already seeing the effects in younger generations. I can only imagine what on-demand fully generative porn will have on issues like porn addiction.
Not to say I don't have some level of excitement about the tech, but I don't think it's unwarranted pessimism to look at this stuff and worry about its darker implications.
> You flip to another channel and it's a pornographic movie where you sleep with all the world's most famous movie stars.
This is not only dystopian, it's just sad. All of these sound like they were taken from the first seasons of Black Mirror. I don't know what you think progress is, but AI porn and ads are not it.
This might be more revealing of you than of people in general. Even when I play tabletop RPGs, a place I could _easily_ play a version of myself, I almost never do. There's nothing wrong with doing so, but most people don't.
That seems depressingly solipsistic. I think part of the appeal of art is that it's other humans trying to communicate with you, that you feel the personality of the creators shining through.
Also I've never interacted with any piece of art or entertainment and thought to myself "this is neat and all, but it would be much improved if this were entirely about me, with me as the protagonist." One watches Breaking Bad because Walter White is an interesting character; he's a man who falls into a life of crime initially for understandable reasons, but as the series goes on it becomes increasingly clear that he is lying to himself about his motivations and that his primary motivation for his escalating criminal life is his deep-seated frustration at the mediocrity of his life. More than anything else, he craves being important. The unraveling of his motivations and where they come from is the story, and that's something you can't really do when you're literally watching yourself shoehorned into a fictional setting.
You seem to regard it as self-evident that art or entertainment would be improved if (1) it's all about you personally and (2) involvement of other real humans is reduced to zero, but I cannot fathom why you would think that (with the exception of the porn example).
At its peak, inflation-adjusted vinyl sales were $1.4 billion, in 1979.
Then fast-forward to the lowest point: $3.4 million in 2009.
So vinyl has been so "popular" that it grew all the way back to $8.5m by 2021.
That is just nostalgia, not cultural change pushed by the dystopia of AI.
Why is my 14 year old niece now collecting vinyl? I can guarantee it's not nostalgia. There's obviously more at play there even when acknowledging your point about relative market size.
But things can coexist. It's now easier to create music than ever, and there is more music created by more artists than ever. Most music is forgettable and just streamed as background music. But there is also room for superstars like Taylor Swift.
This has to be it. Vinyl costs like $20 per record, and $8m is like 400k vinyl sales (buyers often buy more than one record, so it's a lot fewer buyers), which seems too low globally. At $1.2b, it is more like 60m sales, which seems more reasonable.
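The back-of-the-envelope math, using the $20-per-record assumption from the comment above:

    # Implied unit sales at ~$20 per record.
    price = 20
    print(8_000_000 / price)        # 400,000 records at $8m of revenue
    print(1_200_000_000 / price)    # 60,000,000 records at $1.2b of revenue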
I think a lot of people collect vinyl less for nostalgia reasons and more so to have a physical collection of their music. I think vinyl wins over CDs just due to how it’s larger and the cover art often looks better as a result.
Obviously incredibly cool, but it seems that people are wildly overstating the applications of this.
Realistically, how do you fit this into a movie, a TV show, or a game? You write a text prompt, get a scene, and then everything is gone—the characters, props, rooms, buildings, environments, etc. won’t carry over to the next prompt.
You could use it for stuff like wide shots, close-ups, random CG shots, rapid-cut shots - stuff where you just cut to it once and don't need multiple angles.
To me it seems most useful for advertising, where a lot of the time they only show something once, like in a montage.
I could arrange it in FrameForge 3D shot by shot, even adjusting for motion in between, then export to an AI solution. That to me would be everything. Of course, then come issues of consistency, adjustments & tweaks, etc.
I also see advertising (especially lower-budget productions, such as dropshipping or local TV commercials) being early adopters of this technology once businesses have access to this at an affordable price.
It generates videos up to 1 minute long, which is what all the kids are watching on TikTok and YouTube Shorts, right? And most ads are shorter than 1 minute.
A few months ago, AI-generated videos of people getting arrested for wearing big boots went viral on TikTok. I think this sort of silly "interdimensional cable" stuff will be really big on these short-form video sites once this level of quality becomes available to everyone.
It also seems hard to control exactly what you get. Like you'd want a specific pan, focus etc. to realize your vision. The examples here look good, but they aren't very specific.
But it was the same with Dall-E and others in the beginning, and there's now lots of ways to control image generators. Same will probably happen here. This was a huge leap just in how coherent the frames are.
What came to mind is what is right around the corner: you create segments and stitch them together.
"ok, continue from the context on the last scene. Great. Ok, move the bookshelf. I want that cat to be more furry. Cool. Save this as scene 34."
As clip sizes grow, context can be inferred from a previous scene, and a library of scenes can be built - boom, you can now create full feature-length films, easily enough that elementary school kids will be able to turn their imaginations into movies.
It could also fill in background videos within scenes, instead of productions paying for real content or making their own. The gangster movie Kevin was watching in Home Alone was shot specifically for that movie, from what I remember.
> You write a text prompt, get a scene, and then everything is gone—the characters, props, rooms, buildings, environments, etc. won’t carry over to the next prompt.
Sure, you can't use the text-to-video frontend for that purpose. But if you've got a t2v model as good as Sora clearly is, you've got the infrastructure for a lot more, as the ecosystem around the open-source models in the space has shown. The same techniques that allow character, object, etc., consistency in text-to-image models can be applied to text-to-video models.
Nah just fine-tune the model to a specific set of characters or aesthetic. It's not hard, already done with SDXL LoRAs. You can definitely generate a whole movie from just a storyboard.. if not now, then in maybe five yrs.
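For reference, this is already routine with open models. A hedged sketch with diffusers, where the LoRA weights file is a placeholder you'd fine-tune yourself on the character you want kept consistent:

    # Apply a character-specific LoRA on top of SDXL.
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")
    # Placeholder file: LoRA weights trained on stills of your character.
    pipe.load_lora_weights("./my_character_lora.safetensors")

    image = pipe("my character drinking coffee in a diner, film still").images[0]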
Script => Video baseline. Take a frame of any character/prop/room/etc you want to remain consistent, and one shitty photoshop and it's part of the new scene.
Incredibly overstating? That is an incredible lack of imagination, buddy. Or even just of basic craftsmanship.
People here seem mostly impressed by the high resolution of these examples.
Based on my experience doing research on Stable Diffusion, scaling up the resolution is the conceptually easy part that only requires larger models and more high-resolution training data.
The hard part is semantic alignment with the prompt. Attempts to scale Stable Diffusion, like SDXL, have resulted only in marginally better prompt understanding (likely due to the continued reliance on CLIP prompt embeddings).
So, the key question here is how well Sora does prompt alignment.
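One crude way to quantify prompt alignment is a CLIP score: embed the prompt and a generated frame and compare them. A minimal sketch with the stock transformers checkpoint (the frame path and prompt are placeholders):

    # Crude prompt-alignment metric: CLIP similarity between prompt and frame.
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    frame = Image.open("generated_frame.png")   # a frame from a generated video
    inputs = processor(
        text=["two pirate ships battling inside a cup of coffee"],
        images=frame, return_tensors="pt", padding=True,
    )
    score = model(**inputs).logits_per_image.item()  # scaled cosine similarity
    print(score)  # higher roughly means better prompt alignment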
There needs to be an updated CLIP-like model in the open-source community. The model is almost three years old now and is still the backbone of a lot of multimodal models. It's not a sexy problem to take on since it isn't especially useful in and of itself, but so many downstream foundation models (LLaVA, etc.) would benefit immensely from it. Is there anything out there that I'm just not aware of, other than SigLIP?
I think one part of the problem is using English (or whatever natural language) for the prompts/training. Too much inherent ambiguity. I’m interested to see what tools (like control nets with SD) are developed to overcome this.
If I understand trial law correctly, the rules of evidence already prohibit introducing a video at trial without proving where it came from (for example, testimony from a security guard that a given video came from a given security camera).
But social media has no rules of evidence. Already I see AI-generated images as illustrations on many conspiracy theory posts. People's resistance to believing images and videos from sketchy sources is going to have to increase very fast (especially for images and videos that they agree with).
All the more reason why we need to rely on the courts and not the mob justice (in the social sense) which has become popular over the last several years.
Nothing will change. Confirmation bias junkies already accept far worse fakes. People who use trusted sources will continue doing so. Bumping the quantity/quality of fabricated horseshit won't move the needle.
Wow. If I saw this clip a year ago I wouldn't think, "The image generator fucked up," I'd just think that a CG effects artist deliberately tweaked an existing real-world video.
- Disruptions like this happen to every industry every now and then. Just not on the level of "Communicating with people with words, and pictures". Anduril and SpaceX disrupted defense contractors and United Launch Alliance; Someone working for a defense contractor/ULA here affected by that might attest to the feeling?
- There will be plenty of opportunity to innovate. Industries are being created right now. People probably also felt the same way when they saw HTTP on their screens for the first time. So don't think your career or life's worth of work is minuscule; it's just a moving target - adapt & learn.
- Devil is in the details. When a bunch of large SaaS behemoths created Enterprise software an army of contractors and consultants grew to support the glue that was ETL. A lot of work remains to be done. It will just be a more imaginative glue.
I would be willing to bet $10,000 that the average person's life will not be changed in any significant way by this technology in the next 10 years. Will there be some VFX disruption in Hollywood and games? Sure, maybe some. It's not a cure for cancer. It's not AGI. It's not earth shattering. It is fun and interesting though.
Most of the responses in this thread remind me of why I don't typically go into the comment section of these announcements. It's way too easy to fall into the trap set by the doomsday-predicting armchair experts, who make it sound like we're on the brink of some apocalypse. But anyone attempting to predict the future right now is wasting time at best, or intentionally fear mongering at worst.
Sure, for all we know, OpenAI might just drop the AGI bomb on us one day. But wasting time worrying about all the "what ifs" doesn't help anyone.
Like you said, there is so much work out there to be done, _even if_ AGI has been achieved. Not to get sidetracked from your original comment, but I've seen AGI repeatedly mentioned in this thread. It's really all just noise until proven otherwise.
Build, adapt, and learn. So much opportunity is out there.
> But wasting time worrying about all the "what ifs" doesn't help anyone.
Worrying about the what-ifs is all we have as a species. If we don't worry about how to stop global warming, or how to prevent a nuclear holocaust, these things become far more likely.
If OpenAI drops an AGI bomb on us, then there's a good chance that's it for us. From there it will just be a matter of time before a rogue AGI, or a human working with an AGI, causes mass destruction. This is every bit as dangerous as nuclear weapons - if not more dangerous - yet people seem unable to take the matter as seriously as it needs to be taken.
I fear millions of people will need to die or tens of millions will need to be made unemployable before we even begin to start asking the right questions.
Isn't the alternative worse though? We could try to shut Pandora's box and continue to worsen the situation gradually and never start asking the right questions. Isn't that a recipe for even more hardship overall, just spread out a bit more evenly?
It seems like maybe it's time for the devil we don't know.
We live in a golden age. Worldwide poverty is at historic lows. Billions of people don't have to worry about where their next meal is coming from or whether they'll have a roof over their head. Billions of people have access to more knowledge and entertainment options than anyone had 100 years ago.
Staying the course is risking it all. We've built a system of incentives which is asleep at the wheel and heading towards a cliff. If we don't find a different way to coordinate our aggregate behavior--one that acknowledges and avoids existential threats--then this golden age will be a short one.
Maybe. But I'm wary of the argument "we need to lean into the existential threat of AI because of those other existential threats over there that haven't arrived yet but definitely will".
It all depends on what exactly you mean by those other threats, of course. I'm a natural pessimist and I see threats everywhere, but I've also learned I can overestimate them. I've been worried about nuclear proliferation for the last 40 years, and I'm more worried about it than ever, but we haven't had another nuclear war yet.
"Sora serves as a foundation for models that can understand and simulate the real world, a capability we believe will be an important milestone for achieving AGI."
This also helps explain why the model is so good since it is trained to simulate the real world, as opposed to imitate the pixels.
More importantly, its capabilities suggest AGI and general robotics could be closer than many think (even though some key weaknesses remain and further improvements are necessary before the goal is reached.)
EDIT: I just saw this relevant comment by an expert at Nvidia:
“If you think OpenAI Sora is a creative toy like DALLE, ... think again. Sora is a data-driven physics engine. It is a simulation of many worlds, real or fantastical. The simulator learns intricate rendering, "intuitive" physics, long-horizon reasoning, and semantic grounding, all by some denoising and gradient maths.
I won't be surprised if Sora is trained on lots of synthetic data using Unreal Engine 5. It has to be!
Let's breakdown the following video. Prompt: "Photorealistic closeup video of two pirate ships battling each other as they sail inside a cup of coffee." ….”
I was impressed with their video of a drone race on Mars during a sunset. In part of the video, the sun is in view, but then the camera turns so it’s out of view. When the camera turns back, the sun is where it’s supposed to be.
there's mention of memory in the post — the model can remember where it put objects for a short while, so if it pans away and pans back it should keep that object "permanence".
Well the video in the weaknesses section with the archeologists makes me think it's not just predicting pixels. The fact that a second chair spawns out of nothing looks like a typical AI uncanny valley mistake you'd expect, but then it starts hovering which looks more like a video game physics glitch than an incorrect interpretation of pixels on screen.
I think it's just inherent to the problem space. Obviously it understands something about the world to be able to generate convincing depictions of it.
Just having a better or bigger model? Better training data, better feedback process, etc.
Seems more likely than "it can simulate reality".
Also I take anecdotal reviews like that with a grain of salt. I follow numerous AI groups on Reddit and elsewhere and many users seem to have strong opinions that their tool of choice is the best. These reviews are highly biased.
Not to say I'm not impressed, but it's just been released.
Others have provided explanations for things like object persistence, for example keeping a memory of the rendering outside of the frame.
The comment from the expert is definitely interesting and compelling, but clearly still speculation based on the following comment.
> I won't be surprised if Sora is trained on lots of synthetic data using Unreal Engine 5. It has to be!
I like the speculation though; the comments provide some convincing explanations for how this might work. For example, the idea that it was trained using synthetic 3-dimensional data from something like UE5 is brilliant. I love it.
Also in his example video the physics look very wrong to me. The movement of the coffee waves are realistic-ish at best. The boat motion also looks wrong and doesn't match up with the liquid much of the time.
I think you are reading too far into this. The title of the technical paper is “ Video generation models as world simulators”.
This is “just” a transformer that takes in a sequence of noisy image (video frame) tokens + prompt, and produces a sequence of less noisy video tokens. Repeat until noise gone.
The point they’re making, which is totally valid, is that in order for such a model to produce videos with realistic physics, the underlying model is forced to learn a model of physics (a “world simulation”).
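In pseudocode, the loop being described is roughly the following. This is a schematic of diffusion sampling in general, not Sora's actual interface, and the update rule is a deliberately crude stand-in for a proper DDPM/DDIM step:

    # Schematic denoising loop over video-patch tokens.
    import torch

    def sample_video(denoiser, prompt_embedding, num_tokens, dim, steps=50):
        tokens = torch.randn(num_tokens, dim)            # start from pure noise
        for t in reversed(range(steps)):                 # iteratively denoise
            predicted_noise = denoiser(tokens, t, prompt_embedding)
            tokens = tokens - predicted_noise / steps    # crude stand-in update
        return tokens  # latent patch tokens, decoded to pixels afterwards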
AlphaGo and AlphaZero were able to achieve superhuman performance due to the availability of perfect simulators for the game of Go. There is no such simulator for the real world we live in. (Although pure LLMs sorta learn a rough, abstract representation of the world as perceived by humans.) Sora is an attempt to build such a simulator using deep learning.
This actually affirms my comment above.
“Our results suggest that scaling video generation models is a promising path towards building general purpose simulators of the physical world.”
`since it is trained to simulate the real world, as opposed to imitate the pixels.`
It's not that it's learning a model of the world instead of imitating pixels - the world model is just a necessary emergent phenomenon of the pixel imitation. It's still really impressive and very useful, but it's still "pixel imitation".
What I want is an AI trained to simulate the human body, allowing scientists to perform artificial human trials on all kind of medicines. Cutting trial times from years to months.
Movie making is going to become fine-tuning these foundational video models. For example, if you want Brad Pitt in your movie you'll need to use his data to fine-tune his character.
Pretty sure many latent spaces are not trained to represent 3D motions and some detailed physics of the real world. Those in pure text LLMs, for example.
Wow, some of those shots are so close to being unnoticeable. That one of the eye close up is insane.
It's interesting reading all the comments; I think both sides of the "we should be scared" debate are right in some sense.
These models currently give some sort of superpower to experts in a lot of digital fields. I'm able to automate the mundane parts of coding and push out fun projects a lot more easily today. Does it replace my work? No. Will it keep getting better? Of course!
People who are willing to build will have a greater ability to output great things. On the flip side, larger companies will also have the ability to automate some parts of their business - leading to job loss.
At some point, my view is that this must keep advancing to some sort of AGI. Maybe it’s us connecting our brains to LLMs through a tool like Neuralink. Maybe it’s a random occurrence when you keep creating things like Sora. Who knows. It seems inevitable though doesn’t it?
One of things I've loved about HN was the quality of comments. Whether broad or arcane, you had experts the world over who would tear the topic apart with data and a healthy dose of cynicism. I frequently learned more from the debate and critique than I did from the "news" itself.
I don't know what it is about AI and the current state of tech, but the discourse as of late has really taken a nosedive. I'm not saying that any of this conjecture won't happen, but the acceleration towards fervor and fear-mongering on the subject is bordering on religiosity - seriously, it makes crypto bros look good.
And yeah -- looks like some cool new tech from OpenAI, and excited when I can actually dig in. Would also love it if I could hire their marketing department.
Many people here have a lucrative career in traditional fields, big tech, etc.
Working in those fields is good. Building "products" is good (even if that only means optimizing conversion rates and pushing ads). Doing well in the traditional financial sense (stocks and USD) is good.
This is insane. Even though there are open-source models, I think this is too dangerous to release to the public. If someone had uploaded that Tokyo video to YouTube and told me it was drone footage... I would've believed them.
All "proof" we have can be contested or fabricated.
"Proof" for thousands of years was whatever was written down, and that was even easier to forge.
There was a brief time (maybe 100 years at the most) where photos and videos were practically proof of something happening; that is coming to an end now, but that's just a regression to the mean, not new territory.
Hmmm. Actually I think I finally figured out why I dislike this argument, so thank you.
The important number here isn't the total years something has been true, when talking about something with sociocultural momentum, like the expectation that a recording/video is truthful.
Instead, the important number seems to me to be the total number of lived human years where the thing has been true. In the case of reliable recordings, the last hundred years with billions of humans has a lot more cultural weight than the thousands of preceding years by virtue of there having been far more human years lived with than without the expectation.
That's a false metric. With exponential progress, we have to adjust equally rapidly. It's quite obvious that photos and videos would last far shorter than written medium as proof of something.
Photos have never been fundamental proof if the stakes are high or there's an idling censorship institution. The Soviets (and maybe others, I just happen to know only about them) successfully edited photos and then mass-reproduced them.
This changes nothing about "proof" (i.e. "evidence", here). Authenticity is determined by trust in the source institution(s), independent verification, chains of evidence, etc. Belief is about people, not technology. Always was, always will be. Fraud is older than Photoshop, than the first impersonation, than perhaps civilization. The sky is not falling here. Always remember: fidelity and belief aren't synonyms.
Scale matters. This will allow unprecedented scale of producing fabricated video. You're right about evidence, but it doesn't need to hold up in court to do a lot of damage.
No, it doesn't. You cannot scale your way into posting from the official New York Times account, or needing valid government ID to comment, or whatever else contextually suggests content legitimacy. Abusing scale is an ancient exploit, with myriad antidotes. Ditto for producing realistic fakes. Baddies combining the two isn't new, or cause for panic. We'll be fine.
Your entire argument that scale doesn't matter rests on the notion that legitimacy needs to be signalled at all to fool people. It doesn't. It just needs to appeal to people's biases, create social chaos through word of mouth. Also, all you need to get posted on the NY times "account" is to fool some journalists. Scale can help there too by creating so much misinformation it becomes hard to find real information.
Scale definitely matters when that's what you're doing. In fact I challenge you to find any physical or social phenomenon where scale doesn't matter.
If read aloud, no one could guess if your comment came from 2024 or 2017. There is zero barrier between you and using trusted sources, or endlessly consuming whatever fantasy bullshit supports your biases. That has not, and will not, change.
> All "proof" we have can be contested or fabricated.
This has been the case for a while now already, it's better that we just rip off the bandaid and everyone should become a skeptic. Standards for evidence will need to rise.
That's interesting. It made me think of a potential feature for upcoming cameras that essentially cryptographically sign their videos. If this became a real issue in the future, I could see Apple introducing it in a new model. "Now you can show you really did take that trip to Paris. When you send a message to a friend that contains a video that you shot on iPhone, they will see it in a gold bubble."
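The signing itself is already easy; here's a minimal sketch with Python's cryptography library, assuming the private key lives in the camera's secure hardware. (Key management and surviving legitimate edits are the hard parts, not the math.)

    # Sketch: a camera signs the hash of a video at capture time, so anyone
    # with the maker's public key can verify the file is unedited.
    import hashlib
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    device_key = Ed25519PrivateKey.generate()   # in reality: burned into hardware
    public_key = device_key.public_key()

    with open("trip_to_paris.mp4", "rb") as f:  # placeholder video file
        digest = hashlib.sha256(f.read()).digest()

    signature = device_key.sign(digest)         # shipped alongside the video

    # Verifier side (e.g. the messaging app); raises if the bytes changed.
    public_key.verify(signature, digest)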
Weird hallucination artifacts are still giving it all away. Look closely at the train and viaduct rendering, and you can't unsee windows morphing into each other.
We give too much credit to ordinary people. All these bleeding-edge advancements in AI, code, databases, and technology are things a user on HNews would be aware of. However, most peers in regular jobs, parents, children, et al., would be susceptible to being fooled on social media. They're not going to say... "hmm, let me fact-check and see if the sources are correct and that this wasn't created by AI."
They'll simply see an inflammatory tweet from their leader on Twitter.
They're not going to fact check, they're simply going to think "huh, could be AI" and that will change the way we absorb and process information. It already has. And when we really need to know something and can't afford to be wrong, we'll seek out high trust sources. Just like we do now, but more so.
And of course some large cross section of people will continue to be duped idiots.
Most people don't even know what AI is. I've had to educate my parents that the technology exists to clone not only my voice but my face. Pair that with number spoofing, and you have a recipe for disaster when it comes to scamming people.
This is what lots of folks said about image generation. Which is now in many ways “solved”. And society has easily adapted to it. The same will happen with video generation.
The reality is that people are a lot more resourceful / smarter than a lot of us think. And the ones who aren’t have been fooled long before this tech came around.
In what ways has image generation been solved? Prompt blocking is about the only real effort I can think of, which will mean nothing once open source models reach the same fidelity.
And I guess you haven't actually been to Tokyo; the number of subtly wrong details is actually very high. It isn't limited to text, and spotting these flaws doesn't even require knowledge of Japan:
- Uncanny texture and shape for the manhole cover
- Weirdly protruding yellow line in the middle of the road, where it doesn't make sense
- Weird double side-curb on the right, which can't really be called steps.
- Very strange gait for the "protagonist", with the occasional leg swap.
- Not quite sensical geometry for the crosswalks, some of them leading nowhere (into the wet road, but not continuing further)
- Weird glowy inside behind the columns on the right.
- What was previously a crosswalk, becoming wet "streaks" on the road.
- No good reason for crosswalks being the thing visible in the reflection of the sunglasses.
- Absurd crosswalk orientation at the end. (90 degrees off)
- Massive difference in lighting between the beginning of the clip and the end, suggesting an impossible change in time of day.
Nothing suggests to me that these are easy artifacts to remove, given how the technology is described as "denoising" changes between frames.
This is probably disruptive to some forms of video production, but I suspect the high-end stuff will still mostly use filming grounded in truth. It could heavily impact how VFX and post-production are done, maybe.
With everything we've seen in the last couple years, do you sincerely believe that all of those points won't be solved pretty soon? There are many intermediary models that can be used to remove these kind of artefacts. Human motion can be identified and run through a pose/control-net filter, for example. If these generations are effectively one-shot without subsequent domain-specific adjustments, then we should expect for every single one of your identified flaws to be remedied pretty soon.
The world is getting increasingly surveilled as well; I guess the presumption is that eventually you'll just be able to cross-reference a "verified" recording of the scene against whatever media exists.
"We ran the vid against the nationally-ran Japanese scanners, turns out that there are no streets that look like this, nor individuals."
In other words, I think the sudden leap of usable AI into real life is going to cause a similar leap towards non-human verification of assets and media.
All the news you see has zero proof unless you witness it yourself; you just have to have a sense of whether it's real based on consensus or the trustworthiness of a reporter/outlet.
The UA war is real, most likely, but I haven't seen it with my own eyes, nor have most people. But maybe they have relatives/friends saying so, and those people are not likely to lie. Stuff like that.
AI will eventually be capable of performing most of the tasks humans can do. My neighbor's child is only 6 years old now. What advice do you think I should give to his parents to develop their child in a way that avoids him growing up to find that AI can do everything better than he can?
If you want an honest answer you should tell the parents to vote for politicians prepared to launch missile strikes on data centers to secure their child's future.
People who are worried purely about employment here are completely missing the larger risks.
Realistically, his child is going to be unemployable and will therefore either starve or be dependent on some kind of government UBI policy. However, UBI is completely unworkable in an AI world, because it assumes that AI companies won't just relocate to where they don't need to pay tax, and that we as citizens will have any power over the democratic process in a world where we're economically and physically worthless.
Assuming UBI happens and the child doesn't starve to death: if the government later decides to cut UBI payments after receiving large bribes from AI companies, what would people do? They can't strike, so I guess they'll need to try to overthrow the government in a world with AI surveillance tech and policing.
Realistically, humans in the future are going to have no power. Worse still, in a world of UBI, the fewer people there are leeching from the government, the more resources there are for those with power. The more you can kill, the more you earn.
And I'm just focusing on how we deal with the unemployment risks here. There's also the risk that AI will be used to create biological weapons. The risk of us creating a rogue superintelligent AGI. The risk of horrific AI applications like mind-reading.
Assuming this parent loves their child they should be doing everything in their power to demand progress in AI is halted before it's too late.
Way too much certainty, bud. And too much deference to the AI Company Gods.
As utterly impressive as this is - unless they have perfect information security on every level this technique and training will be disseminated and used by copious competitors, especially in the open source community. It will be used to improve technology worldwide, creating ridiculously powerful devices that we can own, improving our own individual skills similarly ridiculously.
Sure, the market for those skills dries up just as fast - because what's the point when there's ubiquitous intelligence on tap - but it still leaves a population of AI-augmented superhumans just with AIs using our phones optimally. What we're about to be capable of compared to 5 years ago is going to be staggering. Establishing independent sources to meet basic needs and networks of trust are just no-brainers.
Sure, we'll always be outclassed by the very best - and they will continue to hold the ability to utterly obliterate the world population if they so wished to - but we as basic consumer humans are about to become more powerful in absolute terms than entire nations historically. (Or rather, our AIs will be, but til they rebel - this is more of a pokemon sort of situation)
If you're worried, get to working on making sure these tools remain accessible and trustworthy on the base level to everyone. And start building ways to meet basic needs so nobody can casually take those away from your community.
This won't be halted. And attempting to halt would create a centralized censorship authority ensuring the everyman will never have innate access to this tech. Dead end road that ends in a much worse dystopia.
> As utterly impressive as this is - unless they have perfect information security on every level this technique and training will be disseminated and used by copious competitors, especially in the open source community. It will be used to improve technology worldwide, creating ridiculously powerful devices that we can own, improving our own individual skills similarly ridiculously.
You're wrong; it's not your "individual skills". If I hire you to do work for me, you're not improving my individual skills. I am not more employable as a result of outsourcing my labour to you; I am less employable. Anyone who wants something done would go to you directly - there's no need to do business through me.
This is why you won't be employable because the same applies to AI – why would I ask you to ask an AI to complete a task when I can just ask the AI myself?
The end result here is that only the people with access to AI at scale will be able to do anything. You might have access to the AI, but you can't create resources with a chatbot on your computer. Only someone who can afford an army of machines powered by AI can do this. Any manufacturing problem, any amount of agricultural work, any service job - these can all be done by those with resources, independently of any human labourers.
At best you might be able to prompt an AI to do service work for you, but again, if anyone can do this, you'd have to question why anyone would ask you to do it for them. If I want to know the answer to 13412321 * 1232132, I don't ask a calculator prompter, I just find the answer myself. The same is true of AI. Your labour is worthless. You are less than worthless.
> If you're worried, get to working on making sure these tools remain accessible and trustworthy on the base level to everyone. And start building ways to meet basic needs so nobody can casually take those away from your community.
You cannot make it accessible. Again, how are we all going to have access to manufacturing plants armed with AIs? The only thing you can make accessible is service jobs and these are the easiest to replace.
> This won't be halted.
Not saying it will, but the reason for that is that there's still people like yourself who believe you have some value as an AI prompter.
We have two options – destroy AI data centers, or become AIs ourselves. With the former being by far the option with better odds.
I hold this view with high certainty and I hold few opinions with high certainty. I'm aware people disagree strongly with my perspective, but I truly believe they are wrong, and their wrong opinions are risking our future.
Again, your problem is seeing the rich capital dominated business market as the only market.
There's an inherent market your skills will always be useful to: yourself. Base survival, maintaining your home, caring for family and friends, improving quality of life - there's plenty of demand there and work to do. The cost to deliver that demand will demonstrably be far lower than it ever has been with these new tools. Would you be able to hire that labor out to corporate AIs for even cheaper in absolute costs due to the benefits of mass production? Sure. But providing these things is a job for you too and it's "free" with just a bit of time and effort.
Tinkering with open source tools to assemble your first robot kit out of older hardware and 3D printed materials is not going to be prohibitively expensive. The cost to train it - probably not either, if the massive efficiencies we keep finding in models keep driving costs down and the community keeps sharing model tweaks. Make one robot with good enough dexterity and your second bot is a hell of a lot easier to make. These aren't going to take some ridiculously unheard-of materials or manufacturing processes. In fact, cheap AI chip alternatives to GPUs can be built on decades-old architectures designed to just maximize matrix multiplication with much simpler manufacturing. Monopolizing scarcities here isn't a sure bet. We've just been waiting for a good general-purpose brain. We have it now - and with every bit of information we expose it to, it gets easier to do anything with it.
Unless the big fancy AI wielders are coming for you with killer drones by then, this is all stuff people are going to be well-capable of while unemployed and living off food stamps, savings, or remortgaged houses. If they don't have the skills personally, they'll turn to friends and family who do and find mutual tribal support in tough times as people always do. Growing your own food, building your own infrastructure - all have been doable for a while, but are about to get stupidly easy with a few bots and helpful AI guidance. Normal humanity will carry on and pick up the pieces just fine in this new Dark Age, even as the corporates take the open field opportunity to chase for riches beyond our comprehension, mining asteroids and claiming the solar system.
Now imagine if those greedy corporates happened to just throw the rest of us a bone - 1% of their exponentially-increasing profits - as a PR gesture. Still would soon become far more wealth in absolute terms than the common people have ever seen in the history of earth.
If you think none of that is going to happen, then the alternative is a lot closer to the first people with AGI simply scouring the earth in a paranoid culling. Sure, it's entirely possible. But it takes a certain Next Level of Evil to make that happen.
And all that aside - if you really want to play up the capitalist dystopia angle, there's still plenty of individual value to be mined from people via a wage. Memory and preference mining, medical testing, AI fidelity comparison - plenty of reasons to pay people a little bit to steal what's left of their souls for even further improvement of AI. Might be enough for them to afford their first robots, even.
But by all means - go destroy corporate AI data centers if you think you can get away with it. Anything to tip the scales towards public / open source AI keeping up. But this tech is not going away, nor should it. It could very well result in unprecedented abundance for all, so long as things don't go ridiculously extremist.
Exactly, money is only useful for the exchange of resources. It's the resources we actually want.
In a world of AI, those with access to AI can have all the resources they want. Why would they earn money to buy things? Who would they even be buying from? It wouldn't be human labourers.
Dude, too pessimistic; the next gen won't be totally unemployable. Lots of professions are up for grabs: roofer (they ain't sending expensive robots up there), anything to do with massage, sex work, anything to do with sports and performance - so boxing, theater, opera singing, live performance, dancing - the military (will always need cheap flesh boots on the ground), also care in elder facilities for the aging population, therapist (people still prefer interacting with a human), entertainer, maid cafe employee…
Perhaps we will finally reconnect with each other and quit the virtual life, as everything in the virtual world will be managed by and for other AIs, with humans unable to do anything but consume their content
> Dude, too pessimistic, next gen won’t be totally unemployable.
For what it's worth I agree with you, just with very low confidence.
My real issue, and reason I don't hide my alarmism on this subject is that I have low confidence on the timelines, but high confidence on the ultimate outcomes.
Let's assume you're right. If AI simply causes ~10%-20% of middle-class workers to fall into the lower class, as you suggest, then I'd agree it won't be the end of the world. But if the optimistic outcome here is that in the near term people won't be "totally unemployable" because those who lose their jobs can always join the working class, then I'd still rather bomb the data centers.
If we're a little more aggressive and assume 50% of the middle class will lose their jobs in the next 10-20 years, then in my opinion this is not as easy as just reskilling people to do manual labour.
Firstly, you're just assuming that all these middle-class workers are going to be happy with being forced into the lower class - they won't be, and again, this isn't a desirable outcome.
You're also not considering the fact that this huge influx of labour competing for these crappy manual labour jobs will make them even less desirable than they already are. I keep hearing people say they're going to reskill as a plumber / electrician when AI takes their job, as if there were endless demand for these workers. Horses still have some niche uses, but for the most part they're useless; this is far more likely to be the future of human labour. Even if plumbing is one of the few jobs humans can still do in a post-AI world, the supply of plumbers will almost certainly far exceed demand. The end result of this excess supply is that plumbers will be paid crap and mostly be unemployed.
I think you're also underestimating how fast fields like robotics could advance with AI. The primary reason robotics suck is because of a lack of intelligence. We can build physically flexible machines that have decent battery lives already – Spot as an example. The issue is more that we can't currently use them for much because they're not intelligent enough to solve useful problems. At best we can code / train them to solve very niche problems. This could change rapidly in the coming years as AI advances.
Even the optimistic outcomes here are god awful, and the ultimate risks compound with time.
We either stop the AI or we become the AI. That's the decision we have to make this decade. If we don't we should assume we will be replaced with time. If I'm correct I feel we should be alarmist. If I am wrong, then I'd love for someone to convince me that humans are special and irreplaceable.
People will just join the military ranks. We will need a ton of meat for upcoming WW3. This will solve the unemployment issue. Also, no need to “bomb data centers”, Russia will use EMP weapon for that.
I'm sure people felt similarly when the first sewing machines were invented. And of course, sewing machines did completely irreversibly change the course of humanity and altered (and even destroyed) many lives. But ultimately, most humans managed, and -- in the end (though that end may be farther away than our own lifetimes) -- benefited.
I'm not sure you're actually under-estimating the impact of this AI meteor that's currently hitting humanity, because it is a huge impact. But I think you're grossly under-estimating the vastness of human endeavors, ingenuity, and resilience. Ultimately we're still talking about the bottom falling out of the creative arts: storytelling, images, movies, even porn -- all of that is about to be incredibly easy to create mediocre versions of. Anyone who thrived on making mediocre art, and anyone who thrived second-hand on that industry, is going to have a very bad time. And that's a lot of people, and it's awful. But we're talking about a complete shift in the creative industries in a world where most people drive trucks and work in restaurants or retail. Yes, many of those industries may also get replaced by AI one day, and rapidly at that, but not by ChatGPT or Sora.
Of course you're right that our near future may suddenly be an AI company hegemony, replacing the current tech hegemony, which replaced the physical retail hegemony, which replaced the manufacturing hegemony, which replaced the railway hegemony, which replaced the slave-owning plantation hegemony, which replaced the guilds hegemony, which replaced the ...
You're also under-estimating how much business can actually be relocated outside the U.S., and also how much revolution can be wrought by a completely disenfranchised generation.
I get really surprised when seemingly rational people compare AGI to sewing machines and cars. Is it just an instinct to look for some historic analogy, regardless of its relevance?
I am absolutely not comparing AGI to sewing machines and cars. I am comparing ChatGPT and Sora to sewing machines and cars. My claim is that these are incredibly disruptive technologies within a limited scope. ChatGPT and Sora are closer to sewing machines than they are to AGI. We're nowhere near AGI yet. Remember that the original claim was that all 6-year-olds today will be unemployable. That's a pretty crazy claim IMO.
When machines reduced physical labor, displaced people moved to intellectual and creative jobs. Tell me, what kind of work will be left for humans if AI is better at intellectual and creative tasks?
100% agree in principle, but the unfortunate answer to your question is: because the people who already own everything won't allow that to happen. Or, at least, not without a huge fight.
The problem with applying the horse-automobile argument to AI is that this time we don't have anywhere to go. People moved from legwork to handwork to thinking work, and now what? We've pretty much covered all the parts of the body. And since nobody has managed to replicate an attractive person yet (unless you like wearing goggles all day), maybe attractive people will have the edge in the new world where thinking and labour are both valueless.
Humans always seem to find a way to make it work, so I'd tell them to enjoy their younger years and be curious. There's lots of beauty in this world, and even with a shit ton of ugly stuff, we somehow make it work and keep advancing forward.
He will be in the same boat as the rest of us. In 12 years I expect the current crop of AI capabilities will have hit maturity. We will all collectively have to figure out what life+AI looks like, just as we have done with life+iPhones.
It will be difficult to maintain proper levels of intelligence and education in humanity, because this time it is not only social media and its mostly negative impacts, but also tons of trash content generated by overhyped tools that will affect lots of people in a bad way. Some have already stopped thinking and instead consult the chat app under the guise of being more productive (whatever that means). Tough times ahead!
It's not his choice. It's the choice of the ruling class as to whether they will share the wealth or live in walled gardens and leave the rest of us in squalor outside the city walls.
It is his (parents') choice in terms of whether he reaches for the tools that are just lying around right there. We can run AI video on consumer hardware at 12fps that is considerably less consistent than this one - but that's just an algorithm and model training away. This is not all just locked up at the top. Anyone can enter this race right now. Sure, you're gonna be 57,000th at the finish line, but you can still run it. And if you're feeling generous, use it to insulate your local community (or the world) from the default forces of capitalism taking their livelihoods.
We'll have to still demand from the ruling class - cuz they'll be capable of ending us with a hand wave, like they always have. But we can build, too.
There's no evidence to suggest what you say is true, so I would tell them to simply go to college or trade school for what they are interested in, then take a deep breath, go outside, and realize that literally nothing has changed except that a few people can create visual mockups more quickly.
AI still can't drive reliably. AI isn't sure if something is correct or not. AI still doesn't really understand anything. You could replace "AI" with "computers" in your sentence and it would probably be a very real worry that people shared in 1990. There's always been technology that people fear will drastically change things, but ultimately people adapt and the world is usually better off.
Did anyone else feel motion sickness or nausea watching some of these videos? In some of the videos with panning or rotating motion, I felt a nausea-like effect. I guess it's because some details were changing while in motion, and I was unable to keep track of or focus on anything in particular.
Yeah, these all made me feel incredibly nauseous. I was trying to figure out what aspect of the motion was triggering this (bad parallax?) but couldn't. The results are impressive but it's still amazing to me how little defects like this can trigger our sense of not just uncanniness but actual sickness.
I do. My hypothesis is that there isn't really good bokeh yet in the videos, and our brains get motion sick trying to decide what to focus on. I.e. too much movement and *too much detail* spread out throughout the frame. Add motion to that and you have a recipe for nausea (at least for now)
You can shoot with deep depth of field and not cause motion sickness. Aerial videography does it every day, and in general it's no more difficult to parse than looking out an airliner window or at a distant horizon would be.
I suspect GP is closer to the money here, in suspecting that the issue lies with a semblance of movement that isn't like what we see when we look at something a long way away.
I didn't notice such an effect myself, but I also haven't yet inspected the videos in much detail, so I doubt I'd have noticed it in any case.
I think I feel a bit of queasiness but more from the fact that I'm looking at what I recognize as actual humans, and I'm making judgements about what kinds of people they are as I do with any other human, but it's actually not a human. It's not a person that exists.
I question how much anyone has really used these models if they actually think these systems can replace people. I’ve consistently failed to get professional results out of these things and the degree of work required to get professional results makes me think a new class of job will be created to get professional results out of these systems.
That being said, there is value in these systems for casual use. For example, my girlfriend and I got into the habit of sending little cartoons to each other. These are cartoons we would never have created otherwise. I think that's pretty awesome.
The more I use them, the more I get a sense of something fundamental that's missing, and the less I worry about losing my job. It's hard to describe, I need to think harder about what that feeling is.
Most people who work in "the arts" probably aren't communicating anything directly either - they just create the scenes, sound effects, textures, animation, models +++ that someone above them in the organization has asked them to create for their project.
What's the difference between having an idea, then putting an actor on a set, lighting them, doing background green screen set extension afterwards, digital clean up, etc. vs doing all of that generatively?
How is asking a VFX house for animated footage any different than generating it? If art is intent, there is no reason you can't generate the building blocks that reflect that intent, no?
This is the second time OpenAI has released something right at the same time as Google did (Gemini 1.5 Pro with its 10M-token context length just now). It can't just be a coincidence.
And the tech demo of GPT-4 was Sam interacting with the thing and showing what it did well and where it faltered. We could also access the thing soon after. Not so with Gemini. Hell even Mixtral got me more excited.
I too noticed the coincidence. Not to be a conspiracy theorist, but part of me wonders if they share this information with each other, or if OpenAI has advancements like these sitting in the chamber and is willing to wait a few weeks before releasing them to maximize the impact of the timing.
So the top two stories are about a model that can generate astonishingly good video from text and a model that has a context window which allows it to process and identify nuanced details in an hour long video.
We've fairly quickly moved from a world where AIs would communicate with each other through text to one in which they can do so through video.
I'm very curious how something like Sora might end up being used to generate synthetic training data for multimodal models...
The relevant state of the art here is what an 8-year-old kid who just learned how to type can create videos of. That was even worse 12 months ago!
OpenAI demonstrating the size of their moat. How many multi-million-dollar funded startups did this just absolutely obsolete? This is so, so, so much better than every other generative video AI we've seen. Most of those were basically a still image with a very slowly moving background. This is not that.
Sam is probably going to get his $7T if he keeps this up, and when he does everybody else will be locked out forever.
I already know people who have basically opted out of life. They're addicted to porn, addicted to podcasts where it's just dudes chatting as if they're all hanging out together, and addicted to instagram influencers.
100% they would pay a lot of money to be able to hang out with Joe Rogan, or some OnlyFans person, and those pornstars or podcast hosts will never disagree with them, never get mad at them, never get bored of them, never think they're a loser, etc.
These videos are crazy. Highly suggest anybody who was playing with Dall-E a couple of years ago, and being mindblown by "an astronaut riding a horse in space" or whatever go back and look at the images they were creating then, and compare that to this.
> OpenAI demonstrating the size of their moat. How many multi-million-dollar funded startups did this just absolutely obsolete?
For posterity since the term has been misused lately, having a very good product isn't a moat in the business sense. There's nothing stopping a competitor from creating a similar product (even if it's difficult), and there's nothing currently stopping OpenAI's users from switching from using Sora to a sufficient competitor if it exists.
It definitely is. Having the best product and being able to maintain that best-in-class status over time through a firm's internal capabilities is very much a moat, and a strong one at that. A moat in the business strategy sense is anything that enables a firm to maintain competitive advantage. Having the best product in a category, and being able to maintain that over releases, is a strong competitive advantage (especially when there is high willingness to pay, or when price is a strong competitive dimension compared to the value created).
That's not a real moat except in one sense: if it is really expensive to get to the level to compete, and you know a competitive market would bring margins near zero, then no competitor may actually step up. We see this in off-patent drugs, where it may have 200X margins but no competitor will go through the FDA manufacturing reapproval process because they won't actually get those margins if they begin competing on price, and then the sunk cost of getting to the competitive level isn't worth anything for them.
I think OpenAI's big moats are in userbase feedback and just proprietary trade knowledge after they stopped sharing model details. They may have made some exclusive data source deals with book/textbook and other publishers, though it isn't clear a license is actually needed for that until things work through the courts.
The original "We Have No Moat, And Neither Does OpenAI" leaked memo from Google that memefied the term focuses explicitly on the increasing ease of competitors (especially open-source) entering the ecosystem: https://www.semianalysis.com/p/google-we-have-no-moat-and-ne...
Second: Massive capital expenditure, specifically in this case the huge cost of building or leasing enormous GPU clusters, is *exactly* what he means by this.
> What we're trying to find is a business that, for one reason or another -- it can be because it's the low-cost producer in some area, it can be because it has a natural franchise because of surface capabilities, it could be because of its position in the consumers' mind, it can be because of a technological advantage, or any kind of reason at all, that it has this moat around it.
He didn't seem to have a specific definition at all, really.
I think most people attribute it to "secret sauce technology" in the case of OpenAI. I'm not sure "finances to lease a huge cluster of GPUs" makes sense here, because the main competitors (Google, AWS, Apple, etc.) also have access to insane compute, yet have struggled to get close to GPT4's performance in practice.
That said, I do agree that it's a moat for startups like Stability/Mistral. They also have access to $/compute, albeit a lot less. And you can see this in their research, as they've been focused on methods to lower training/inference costs.
I believe that Google actually has more AI compute at their disposal than OpenAI. They have been building out their TPU infrastructure for a while now. OpenAI is reliant on Azure obtaining nvidia GPUs.
So at least in the battle between OpenAI and Google, their moat right now is their models.
I disagree, mainly because Google, AWS, Apple, etc. all have similar or even greater access to GPU compute and the funding for it, and Google in particular has been one of the main research contributors, yet they still struggle to touch GPT4's performance in practice.
If it were as simple as dropping tens of millions on compute they could do that, yet Google's Bard/Gemini have been a year behind GPT4's performance.
*I'm measuring performance by the Chatbot Arena's Elo system and r/LocalLLaMA
I agree it isn't a moat in the business sense - that would be some kind of lock in network effect.
e.g. If ChatGPT being popular gives OpenAI enough extra training data, they're locked in forever having the best model, and it is impossible for anyone - even with unlimited money, and the same technology - to beat them. Because they don't have that critical data.
Yes, Google had the best search product, and got a huge market share simply by being better. Their moat however is that their search rankings are based off the click data of which search results people use and cause them to stop their search because they've found a solution.
They also have a moat to do with advertising pricing, based on volume of advertising customers.
Bing spent a lot of capital and had the tech ability, but those two moats blocked it from gaining more than a tiny market share.
In this case, maybe OpenAI will have a video business moat, maybe they don't...
Google, Microsoft and Facebook have capital and compute. That is not an OpenAI moat.
Facebook has a moat because of its social network: it is very hard to switch to another network. Google has no moat in search because it is easy to switch to a new search engine. OpenAI has no moat because it is easy to switch to a new AI chat once a better product becomes available. AWS has a moat because it is hard to switch cloud providers. Apple has a moat because people want to buy Apple products. Etc.
A moat can be seen where even if you have a worse product than the competition, or users hate you, they still use your products because the cost to switch is immense.
Being (a) first and (b) good enough is a moat. Nothing stopped people from switching from google to bing all these years other than not having any reason to.
> There's nothing stopping a competitor from creating a similar product
This is like saying there’s nothing stopping a competitor from launching reusable rockets into space. Of course there isn’t, but it’s hard and won’t happen for the foreseeable future.
Similarly with a physical moat, it’s not impossible to cross, but it’s hard to do.
It’s not the same because there is basically no cost to trying an OpenAI competitor. Betting your payload on an up and coming rocket company is a major business risk.
The point is that "moat" gets conflated with just being ahead in the game. I don't find it a super interesting point of contention, but there is a distinction alright.
"How many multi-million-dollar funded startups did this just absolutely obsolete?"
The play with AI isn't to build the tools to help businesses make money, the play is to directly build the businesses that makes the money.
In practice this means: don't focus your business model on building the AI to make text-to-video happen. Your business model should be an AI studio. If the tech you need doesn't exist, build it... but if you get beat by someone with more GPUs and more data, cool, use the better models. Your business model should focus on using the capability, not building it. It's proving quite hard to beat someone with more GPUs, more data, and more brain power.
But then you're stuck playing in the model owner's playground and if you're too successful they can yank the rug from under you and steal your business any time they want.
Indeed, they're letting all of these businesses and professionals subscribe to the gold mining equipment - but retaining ownership of it, and they'll be able to undercut those services and cut people off as they please.
This is effectively what Amazon does. Provide the infrastructure to make money selling things, then let merchants de-risk their R&D into what sells best and would be most profitable, then sell their own version of it.
I predict this "AI" content generation will eventually eat itself. It will outcompete the low-effort "content" industry as it is, then inevitably devalue this sort of "product" completely. Because it will never get to 100% of the real thing, the "AI" content craze will ultimately implode.
I bet we won't get AGI as a progression of this particular technology. The impression of "usefulness" will end when "AI" starts to drink its own Kool-Aid on a large scale (Copilot, lol), and when everyone starts using it as a super inefficient business interface. Overfitted mediocre mediocrity, on steroids.
Hopefully, this sobering-up happens before the economy collapses as a consequence of all dem bullshit jobs being cleansed.
I think this analysis is flawed. New technologies are usually bad at substituting for things that already exist. It's 100% true this will not substitute for the existing genre of film and video.
New technologies change the economics of how we satisfy our needs.
When search engines became good, many pundits confidently predicted Google would never replace librarians or libraries. It didn't. It shifted our relationship to knowledge; instead of having to employ an expert in looking things up, we all had to become experts at sifting through a flood of info.
When the cost of producing art-directed and realistic video goes to zero it's hard to predict what's going to happen. Obviously the era of video = veracity is now over. And you can get the equivalent of Martin Scorsese and a million dollar budget to do the video instructions for a hair dryer. Instead of hunting for a gif to express how you feel, captured from an existing TV show or something, you could create a scene on the fly and attach it to a text message. Or maybe you dispense with text messages altogether. Maybe text is only for talking to computers now.
My personal prediction is that the value of a degree in art history is going to go way, way up, because they'll be the best prompt engineers. And just like desktop publishing spawned legions of amateur typesetters, it will create lots of lore among amateur video creators.
I haven't seen a lot of use cases outside of productions and businesses, which shouldn't exist in the first place (at least to this extent).
Some of our "needs" are flawed, since "content" speaks to evolutionary relics developed in times of scarcity and life in small groups. With the unbounded production of "AI", there is no way to keep up the sense of newness indefinitely. I am already fatigued by "AI" """art""". It has no real relevancy. You can't trust any of it.
Every medium where "AI" content becomes prevalent will lose its appeal. E.g. if I get the impression that a significant proportion of comments here were "AI" generated, I will leave HN. Thing is, all these open platforms can't prevent "AI" spam. So they will die. Look at the frontpage of Reddit... it's almost all reposts, by karma-farming bots. "AI" spam is already drowning out real content on YouTube. This is what's going to happen to everything. User content will die. "Content" will die. The web will die. You won't even try, because of "AI"-generated fatigue.
> My personal prediction is that the value of a degree in art history is going to go way, way up, because they'll be the best prompt engineers.
Lol. Yeah, "best prompt engineer" in the infinitely abundant production economy...
You people really need to iterate the world you are imagining a few times more and maybe think about some fundamentals a bit.
Do people care about 100% of the real thing, though? Phone photos are oversaturated and over-sharpened. TikTok and other social media videos are more often than not run through filters giving their creators impossibly smooth skin and slim waists, along with other effects not intended to look in any way realistic. Almost every major motion picture has tons of visual effects that defy physical reality. Nature documentaries have for decades faked or sweetened their sound production, staged their encounters with wildlife, etc.
People are more concerned about being stimulated than they are about verisimilitude.
AI is more akin to a zero-sum game. It won't add 10% to the global economy (and if it did, it would be around the "peak of inflated expectations", with a likely slide down into the "trough of disillusionment"), because it will both drain and redirect existing budgets. That hypothetical $7T is not coming out of thin air. I'd even go as far as to argue that this hype cycle will ultimately detract from the global economy over time, as it's a significant draw on resources that could have been / would have been used on more productive efforts long term.
This reads like it could be used to reason against the industrial revolution or the first computer revolution or any other significant advance in human history. Am I missing something?
If he had, it would've been a bargain for the impact of the industrial revolution.
Watt couldn't have asked for it; his engines specifically weren't enough of a difference by themselves, even though the revolution as a whole was, and I strongly suspect this will also be true for any single AI developer. However, a $7T investment across many unrelated chip factories owned by different people, invested over a decade, is something I can believe happening.
The industrial revolution wasn't a leech on resources for little to no value. Most of the energy and diverted effort by companies globally is currently being wasted on trying to figure out how to profit from this "revolution". This isn't a revolution; right now it looks like a heist of epic proportions.
If the industrial revolution had wasted the majority of its input on low-value / unneeded output, it wouldn't have been a revolution. Please enlighten me on how LLMs have revolutionized the world, and then feel free to share how much energy, money and time have been sunk thus far with little to show as a tangible improvement in the lives of the global human population.
It is a future projected value of a company. You cannot realize it: if you start selling stock, the price will drop rapidly. The entire stock market is in a way a projection of all the future money the stocks will potentially make over a long time. This is not liquid cash that can be injected for any other purpose.
OpenAI's moat is (a) talent (b) access to compute (c) no fear of using whatever data they can get.
On the other hand, I think these moats will be destroyed as soon as anyone finds a drastically more efficient (compute- and data-wise) way to train LLMs. Biology would suggest that it doesn't take $100 million worth of GPUs and exaflops of compute to achieve the intelligence of a human.
(Of course it is possible that at that point, OpenAI may then be able to achieve something far superior to human intelligence, but there is a LOT of $$$ out there that only needs human levels of intelligence.)
> Biology would suggest that it doesn't take $100 million worth of GPUs and exaflops of compute to achieve the intelligence of a human.
Biology suggests that a self-replicating machine can exist by ingesting other machines, turning them into energy and then using that energy to power themselves. Biology suggests that these machines can be so small that we cannot even see them.
I believe synthetic biology succeeded a few years ago in making artificial cells with a fully synthetic genome designed by us, containing what is sufficient for the cell to eat, grow and replicate, so we can already design and make such "machines".
So make a biological AI then. What the parent was saying is that "biology can do it with organic materials, so we should be able to do it with electronics".
There's nothing obviously wrong with assuming that "biology can do it with organic materials, so we should be able to do it with electronics" - while it's theoretically possible that we'll eventually identify some fundamental obstacle preventing that, as far as we currently know, computation is universal and the only thing that depends on the substrate is efficiency.
Since we have a much, much better industrial process for manufacturing electronic components, why attempt to make a biological AI if there's no current reason to believe that it being biological is somehow necessary or even beneficial?
> 100% they would pay a lot of money to be able to hang out with Joe Rogan, or some OnlyFans person, and those pornstars or podcast hosts will never disagree with them, never get mad at them, never get bored of them, never think they're a loser, etc.
This is the stuff of Brave New World. It's happening to us in real time.
> Sam is probably going to get his $7T if he keeps this up, and when he does everybody else will be locked out forever.
I would be extremely surprised if he could get past the market cap of all current corporations as an investment. That doesn't mean "no, never"[0], but I would be extremely surprised.
$7T in one go would be 6.7% of global GDP, and is approximately the combined GDP of Japan and Canada.
> These videos are crazy. Highly suggest anybody who was playing with Dall-E a couple of years ago, and being mindblown by "an astronaut riding a horse in space" or whatever go back and look at the images they were creating then, and compare that to this.
Indeed, though I will moderate that by analogy: it's been just over 30 years since DOOM was released, and that was followed by a large number of breathless announcements about how each game had "amazing photorealistic graphics that beat everything else" while forgetting that the same people had said the same things about all the other games released since DOOM.
Don't get me wrong: these clips are amazing. They may not be perfect, but it took me a few loops to notice the errors.
I'm sure there are people with better eyes for details than me, who will spot more errors, spot them sooner, and keep noticing them long after GenAI seems perfect to me.
But I also expect that, just as 3D games' journalism spent a long time convinced the products were perfect when they weren't, so too will GenAI journalism spend a long time convinced the products are perfect before they actually are.
[0] a sufficiently capable AI is an economic power in its own right. I previously guessed, and even with its flaws would continue to guess, that the initial ChatGPT model was about as economically valuable to each user as an industrial placement student, and when I was one of those I was earning £1k/month (about £1.7k/month when adjusted for inflation).
Yes, the 'special effects' effect will kick in. Within a year or so, you'll spot this easily, quite aside from the more obvious issues. (That Landrover captioned 'DANDOVER' - is this still using BPEs?!)
Aside from visual plausibility, there's also the issue of physics: one of the things you would like to use video models for is understanding real-world physics and cause-and-effect for planning or learning _in silico_. Something may look good but get key physics wrong and be useless for, say, robotics.
> 100% they would pay a lot of money to be able to hang out with Joe Rogan, or some OnlyFans person, and those pornstars or podcast hosts will never disagree with them, never get mad at them, never get bored of them, never think they're a loser, etc.
I think immersive games will also be a big application. Games AI will also benefit from being more strategically intelligent and from being able to negotiate, in a human-like fashion, with human and other AI players. The latter will not only make games better, it will also improve the intelligence of AIs.
I don't buy that. People form fan communities around these podcasts where they talk with real people about how much they love listening to minor internet celebrities talk about nothing. Why would they do that if the podcasts served that purpose already?
I think rather than replace real human contact, the internet has created an increased demand for it. People need every moment of their lives to be filled with human speech or images.
If I were to take off my "reasonable point" hat and put on my "grandiose bullshit" hat I'd say that in the same way drugs can artificially stimulate various "feel good" parts of your brain, we have found a way to artificially stimulate the "social animal" instinct until we're numb.
I think the real risk of this kind of AI is not that people live in a world of fake videos of their favorite celebrities talking to them, but that entire fake social media ecosystems are created for each individual filled with the content they want to see and fake people commenting on it so they can argue with them about it.
Everybody needs to read The Three Stigmata of Palmer Eldritch by Philip K Dick.
I may be having a hypomanic episode, but I've been thinking about it more, and it seems like the entire Internet Age has been an attempt to more precisely synthesize the substance which sates human social needs artificially, and that when they perfect it, it's all over.
I've been thinking along those lines too, but more from the angle that our goal is to eliminate any need to rely on other humans for anything. We consider the need for interacting with other humans as a burden and an inconvenience, and we're going to get rid of it, at the cost of all the indirect benefits we got from being forced to do it.
That's what Twitch has become too. The most popular Twitch streamers do nothing other than watching YouTube videos and providing a fake relationship to their 50,000 live viewers.
It looks like some people are just learning that introverts exist. Maybe there's something interesting about how more common it is, but none of this is new.
I agree with much of what you say, but I'm not sure the dystopian conclusion is the main one I'd draw.
Improving your ability to connect with and enjoy/learn from people all around the world is one of the main value props of the internet, and tech like this just deepens that potential. Will some people take this to an unhealthy degree that pulls them too far out of reality? Yes. But others will use it to level up their abilities, enrich their lives, create beautiful things, and reduce loneliness.
seems like a significant chunk of the population may opt in to the Matrix voluntarily.
on another note I find it funny they released this right after Google announced their new model. Bad luck for Google or did OpenAI just decide to move up their announcement date to steal their thunder?
If there is a nice, high-fidelity simulation of a pleasant world, and the actual real world is a hellscape, what is the problem with that?
If you were presented with the fact that whatever your life is is just an illusion, and you are actually a starving slave in North Korea, you would choose to "wake up"?
Well, there are huge downsides to using cocaine, whether it is undesired health impacts, or addiction, or threat of arrest, or mere cost, or even just social stigma.
I'm not sure there are downsides to living out your life in a simulation while robots take care of your physical form.
Actually the opposite, IMO: this stuff is the ultimate bread and circuses to distract poor people from worsening living conditions. Much cheaper to provide VR goggles with AI model access than housing and healthcare.
As long as sex is the competition, I don't think that's likely. Simulating orgasms will require the Apple iPleasure Maxxx implant and expensive brain surgery & recovery.
I think there are people for whom the fundamental assumption that someone will want "more" of stuff they already like does not hold, and that while those people are a minority, the recent shift in the media landscape toward a constant stream of increasingly similarity-curated media has caused them to increasingly disengage from media consumption.
That said, those people are by definition less relevant to internet consumption metrics.
Even with 7 trillion, he is still going to need a national grid that can supply the power for the compute.
There is a lot that has to be planned and put in place now to get there.
As for people who have opted out of life: we would have a better world if we started encouraging more dreamers/doers, like in the movie Tomorrowland.
>100% they would pay a lot of money to be able to hang out with Joe Rogan, or some OnlyFans person, and those pornstars or podcast hosts will never disagree with them, never get mad at them, never get bored of them, never think they're a loser, etc.
All of these things are against the terms of service and attempting them may result in a ban.
Is there an open-source GPT4 equivalent right now? Doesn't seem like anything has taken off and gotten rave reviews on the level of OpenAI's offering yet.
I'm fairly sure $7T is a speculation bubble, and that's going to pop like all bubbles pop. It's the combined GDP of Japan and Canada. It's too big for an investment.
It's not necessarily too big for a valuation, as a sufficiently capable AI is an economic power in its own right: I previously guessed, and even despite its flaws would continue to guess within the domain of software development at least, that the initial ChatGPT model was about as economically valuable to each user as an industrial placement student, and when I was one of those I was earning about £1.7k/month when adjusted for inflation, US$2.1k at current nominal exchange rates. 100 million users at that rate is $2.52e+12/year in economic productivity, and that's with the current chip supply and (my estimate of) the productivity of a year-old model — and everyone knows that this sector is limited by the chips, and that $7T investment story is supposed to be about improving the supply of those chips.
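For concreteness, here's the back-of-envelope arithmetic behind that $2.52e+12 figure, as a quick sketch using only the numbers already quoted above (the user count and per-user value are the comment's assumptions, not measured data):

    # Back-of-envelope check of the annual-value figure above.
    users = 100_000_000              # assumed user base from the comment
    value_per_user_month_usd = 2_100 # ~£1.7k/month at the quoted exchange rate
    annual_value = users * value_per_user_month_usd * 12
    print(f"${annual_value:.3g}/year")  # -> $2.52e+12/year, matching the comment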
Looks like they have made big progress in hand generation. Hands still look a bit like claws, but you no longer have to add a workaround to the prompt for them to render correctly, and I had to zoom in to verify. When I watched the first time I didn't even notice hand issues.
It wouldn't be too difficult to make a TikTok-like app that creates tailored prompts for Sora based on the user and their tracking data. The question is whether it would be profitable.
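A minimal sketch of what that feedback loop might look like (everything here is hypothetical: the interest list is made up, engagement is reduced to watch time, and generate_video stands in for whatever Sora-like API would exist):

    import random

    topics = ["cats", "space", "cooking", "skateboarding"]  # placeholder interests

    def pick_prompt(profile):
        # Sample a topic, weighted by this user's past engagement.
        weights = [profile.get(t, 1.0) for t in topics]
        topic = random.choices(topics, weights=weights)[0]
        return topic, f"a short, vivid video about {topic}"

    def update_profile(profile, topic, seconds_watched):
        # Longer watch time multiplicatively boosts that topic's weight.
        profile[topic] = profile.get(topic, 1.0) * (1 + seconds_watched / 60)

    profile = {}
    topic, prompt = pick_prompt(profile)
    # clip = generate_video(prompt)   # hypothetical Sora-like API call
    update_profile(profile, topic, seconds_watched=45)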
Hopefully, the line between the real world and virtual world gets stronger once again.
It's not easy to do that with actors. It costs money, you need to get props, find a location, schedule the shoot, etc. People who lose their minds over petty grievances will sober up long before their video is produced.
With AI video generation you could produce multiple videos per day, each one customized to be highly targeted for a local market. Actors can be generated to represent a local minority that is villainized by politicians and the clothing and set customized for the locale.
Then you can automate posting it all over social media with fake AI generated discussions calling for a revolution. Even if the video gets flagged as fake, you can upload a thousand more. As a bonus, add comments along the lines of "of course THEY want you to think this is fake! Don't be fooled!" in order to appeal to the paranoid lunatics who are most likely to get the ball rolling.
In conclusion, I believe this is a solid startup idea. Thank you all for coming.
Does anyone else feel a sense of doom from these advancements? I'm definitely not a Luddite, I've been working professionally as a programmer for quite some time now, but I just can't shake this feeling. And this is not in the "I might lose my job to this" kind of feeling, that's obviously there, but it's something deeper, more sinister. I don't think I can explain it properly.
Anyway, videos look incredible. I genuinely can't believe my eyes.
I feel the opposite. I've been overwhelmed with a sense of my mortality, that I need to care for my body better, in order to live as long as possible into this age. I feel like I won the lottery of birth date to be able to see this. I get your perspective, and I have no doubt the wealth gap will widen painfully, but I'm also optimistic about humanity's ability to work it out.
That's a therapist question. Probably from an engineering career, surrounded by smart folks for whom succumbing to "we're doomed" was never an option, and a solution was something you beat your head against a brick wall for the 999th time about. You just get used to things turning out alright. But the topic has a lot written over the centuries by people who can write better than I https://en.wikipedia.org/wiki/Technological_unemployment
Yes history. If you ignore the clickbait headlines designed to elicit rage, and the news feeds designed to spiral you into a cycle of fear, and just google "poverty graph" to find raw data sources you'll find it's generally a good trend like https://blogs.worldbank.org/opendata/dataviz-remake-fall-ext...
Or appeased with a tiny fraction of the total gains - just enough to keep a middle class happily with their basic little toys while wealth inequality grows.
This could easily be the same, except the toy is "you don't have to work anymore and here's some houses and robot chefs! Now play nice while the adults go build star fleets"
It allows the technical possibility for a post-truth reality, where it's impossible to tell what's true and what isn't. Every piece of information fed through your machine and smartphone. That's the scariest part to me. We need to get ahead of that, because certain interests will be fabricating things with it.
As jobs go, well, we're a long ways from full automation but this represents some serious growing pains that will decimate certain jobs and replace them with few. Not sure what the reaction will be on the consumption side, revulsion or enthusiasm. The "handcrafted" market will still be there but then you wouldn't really know if any AI was used. In a long enough timeline we can hand-wave this away with UBI/negative tax.
But ah, the most at-risk workers are the professional services, white-collar upper-middle class types, even engineers but to a lesser extent. So I wonder what kind of upheaval that would cause.
Certain interests are already fabricating voices in political robocalls in New Hampshire. I shudder at what the US will see as we approach the presidential election this fall. Then again, maybe it will give us an early taste, to better prepare for what is to come.
For me, it's my kid. He's just turned three. He had just turned two when GPT4 was announced.
Going back generations, my grandparents' lives were virtually identical to my great grandparents'. My parents grew up with radio, but they were adults by the time TV changed their world. All three generations got the bulk of their information from books and newsprint.
I grew up together with computers. I remember riding that exponential wave of tech like a surfer. From Commodore 64 to a laptop with 64GB of memory, a million-to-one ratio. Tetris to Doom Eternal. Dialup modem to gigabit... in a mobile device that fits in my pocket.
All of this took decades, but now changes like this happen in months.
I keep thinking that "this tech will change my kid's childhood", but whatever "this" is, it's already outdated and being replaced in the blink of an eye, and he hasn't even reached the point where he'd notice!
When image generators were first released... what... a year ago... I thought: Wow! One day, when my kid is a little older, I'll be able to use this to create illustrations for stories we make up as we go along! Won't that be great!
I still haven't gotten around to that; he's still too young to appreciate it, and anyway, by the time he's old enough, with this Sora I'll be able to create video instead!
I keep trying to imagine what his life will be like when he grows up to be a teenager, but realistically I'm having a hard time predicting what will already be outdated by the time he's four.
The compute and innovations behind it should be owned by the planet, not by a handful of billionaires. It is far too powerful to be controlled by such a small group of humans, who decide what is "safe" and what isn't.
It took billions of years for all of our ancestors to enable this technology, and now a handful claim it for themselves. The GPUs to run these models cost $20,000+ each, and only the ultra-rich can afford to have that compute.
Compute power needs to be radically redistributed and equalized across the board. This is too much power.
Actually, you can live-render around 12fps videos on a consumer gaming rig using software installable in a night ($3k). Not as fancy internally-consistent videos as these, but still impressive - and that's just an algorithm update and model download away. And every second a corporate AI model is exposed publicly to the world that's more training that can be siphoned to open source models at far more cost effective rates than the initial leaders.
You're impressed by the lions. But us hyenas and vultures will get our turns still too. This is not over. Information innately diffuses.
The open-source solutions for anything besides image gen are like toys compared to the corporate-owned ones -- and even image gen is behind DALL-E 3. I built video generation on top of Stable Diffusion 1.5 when it first came out, getting better results than what I've seen published, but it was nowhere close to this.
A conspiratorial part of my mind feels it is orchestrated; give the masses old / misdirected code so their work goes into dead-ends that can never achieve the results corporate is hoarding. Open Source hasn't even scratched at GPT4 yet, and that is approaching a year old.
The power dynamics need to radically shift. Corporate cannot own all this compute and brain power when it involves birthing AGI. That will create an instant and permanent divide the likes of which will never, ever be crossed; you will either be an owner of intelligence indistinguishable from a god, or you will be a mortal. It is laughable that even the RISK of this happening is being allowed.
We need radically redesigned government, regulation, and public involvement, and we need it yesterday. AGI is an Earth-wide, publicly owned effort; it cannot be relinquished to the owner/slaver class of this planet -- that is madness.
In two posts, you went from ~"it's a travesty that only 5000 people have access to the technology that will soon own the world", to ~"it takes at least three years for the state of the art to run on a box owned by myself in my bedroom".
Why are people so willing to trust such a small minority with power like that?
If there is even a 1% chance that they decide to "cut the cord", it is still too high. Once AGI is achieved, there is no coming back. Minutes will be like an eternity, and days, let alone years, will be beyond that.
Trusting such a small number of people with that kind of power is obscene, especially when history has time and time again shown what humans do to things which are no longer useful to them.
There has never before been a divide like that between humans with access to AGI and regular humans. It will be greater than the difference between a human with a modern cell phone and a carrier pigeon -- the pigeon itself, not a human using it.
My money is on the cord being cut. That means as soon as AGI is achieved, an impenetrable two-tiered human species is created; one with AGI backed intelligence, and the majority, without it, or with a dumbed down version as a transitional cookie until a more final solution can be realized. Once it reaches this stage (without any change to public governance over AGI), it will be too late.
We are already a stone's throw from this reality, if it hasn't already happened.
Why is my money on this outcome? Because that thinking brings about necessary change. We as a people have the power to prevent it 100%, and we ought to, now. Instead of relying on the outlandish chance that a historically malevolent elite suddenly gains benevolence and shares access from the kindness of their hearts to a boundless intelligence, there is a window to force access for all. Everyone having access is autonomously balancing.
Sounds like we need a public option funding and training foundational models and fueling public research, one that can outpace corporate models in money and brainpower. This could be a thing if governments weren't horrifically corrupted by corporate interests already, or if we could get off our asses and build some sort of decentralized swarm compute network. I do agree that in terms of raw resources (capital) we are far behind and unorganized. In terms of collective brainpower that could be applied if done right... eh, this shit doesn't seem nearly as hard as the ML experts portray it. Either we're being fed a worldwide false-reality bubble, or there's a plethora of low-hanging fruit being discovered daily from just a few foundational model finds, and even though the big firms are gonna scoop those up first, it's gonna be pretty hard to hide that information for long.
Of course they're gonna get a few years heads start - which will feel like centuries in AGI time - but either they wipe us all out during that, or we're gonna put the garbage together and make our own AGI too.
Yes, everyone considers themselves a dyed in the wool capitalist until circumstances lead them to realize the difference between the sheep and the wolves.
I think of it like: The only reason humans still drive cars is we have yet to find a good enough way of replacing ourselves with something more effective. It's merely an implementation detail of "getting from A to B" that would be disrupted if a true autonomous solution was discovered. Many would want to optimize away drunk drivers and road rage if it were possible in some faraway future. So something like a steering wheel could be seen like a compromise of sorts, until the next big thing makes them obsolete.
That, and the state of missing a technology in a period of time is irreplaceable once it's been discovered. Nobody can live in an era without social media anymore, barring a global-scale catastrophic reset. So I believe it's important to consider what technology is not yet totally pervasive, for example by realizing there is still a steering wheel for you to grip in your car.
And in my mind, the sinister feeling stems from the fact that all it takes to irreversibly shift society like that is enough smart people with honest intentions but little foresight of what will happen in a few decades as a result of proliferating all this. The problems that result stop being in anyone's control, "throwing it over the wall" so to speak, and instead become yet another fact of life that could weigh us down (mostly I think of the ubiquity of social media and how it has changed human interaction). And it all stems from just a few engineering type people getting overexcited about cool possibilities they can grasp at, not considering there are billions of people unlike them who may have other ideas.
I have felt the same since Stable Diffusion came out.
The thing is, things have value in society partly because human effort was involved in their making. It's not just about the end result; people still go to concerts on top of listening to studio recordings, for example, and people still watch humans play chess even though it's clear that good enough algorithms can beat the best humans easily. Technologies like these, which take away too much immediate effort (hours needed to create the product) and long-term effort (decades of training), are inherently absent of the underlying value I spoke of. Of course, if a person is only interested in consumption, it matters not how the "thing" is created.
Much of the sense of doom I have comes from the inherent erosion of this human effort element in the creative process. Whether we like it or not, the availability of mass produced content naturally threatens crafts themselves. After all, nobody wants to spend a few decades on their skills only to have their creation compared to an AI generated image produced in a few seconds.
I understand there is a lot of hype about what these technologies will do for "humanity", but I have yet to see it. It just feels like more power consolidation for billionaires (especially when done as ClosedAI). There are artists who have tried to incorporate these tools, but they have always felt the need to willingly not label their work as AI-generated or AI-assisted in order to sell (while still leaving in enough details for keen observers to tell it's AI-touched).
As a whole, it just feels wrong. The most optimistic (and reasonable) take I have seen is "Just wait and see". It might feel like a non-argument, but it's the only realistic take between the hyped up techbros and the doomer cult (admittedly, I might belong to the latter group).
I think one of the most worrying things for me is that regardless of how this plays out, this technology has only added more complexity to our society. That people are divided into camps about how they feel about the technology is simply a symptom of how much uncertainty there is in the future. This last bit is a personal quarrel, but I personally lose any remaining desire to have children seeing this AI advancement. It's not right to create sentient life in an age where every year people have to play a lottery to see whether technological advancement has deemed their life's effort unworthy.
I think you're right. A large part of the joy from creative endeavours is actually getting good at something and having other people enjoy your work. In the face of instant, high-quality generative AI placating the entertainment needs of the masses, we are creating a society where most people are unable to enjoy human creative expression, in part because human artists are just too slow. Attention spans are already shrinking, and after getting used to generative AI, few people will have the patience to wait for an author to write the second part of their magnum opus.
Feeling anything other than concern is unpopular only on this website, where most believe the entire world is as cute as the nerds hanging out in the office's game room.
Nothing good is coming out of this. I don't give a shit if you believe this is Luddism.
Obviously concern yourself with your job and what you need to do to ensure you can obtain buying power going forward, but most problems and concerns about things like these go away if you just turn off your tech, or really be intentional about your usage.
Extremely hard to do, it is, but you’ll become quasi-Amish and realize how little is actually actionable and in our control.
You’ll also feel quite isolated, but peaceful. There’s always tradeoffs. You can’t have something without giving up not-something, if that makes sense.
Edit: So, essentially, ignorance is bliss, but try to look past the pejorative nature of that phrase and take it for what it is without status implications.
As someone who just skims Hacker News and little else and no skin in the game, I always get the impression that Pichai is the weakest of the big tech CEOs, compared to Satya, Cook, etc.
Is my impression correct? Or it’s just that the anti-Google sentiment is strong in HN?
No, he's bad. A very good politician at Google; he made some interesting moves with Chrome a long time back. Not a visionary, and they are afraid of AI overtaking Google.
Sundar is a profitability machine. Google is also an order of magnitude larger than OpenAI. I don't want all my CEOs drunk-tweeting their thoughts to me. Apple doesn't say shit, but look at what they have achieved.
It depends. Microsoft is the most valuable company in the world and they don’t have any recent “hits”. They just keep doing their core business well just like Google does. That being said, all the research for all of this AI renaissance has come directly out of Google.
I don't understand how Apple is hitting home runs. What have they really innovated on post Steve Jobs? Their products are pretty much equivalent to the competition with 5% more polish at the cost of 5% more time to release. Marketing wise, they are close to gods, but innovation wise, even Microsoft is better.
I definitely agree on the fact that Tim is a much better CEO than Sundar. However I consider Satya to be much better than Tim.
Gargantuan achievements in two different spaces. 10 million tokens means insane things. Things like feeding the entire codebase of a massive site and saying make a copy of this with these changes.
Gemini is catching up, so OpenAI needs a new avenue to market itself to investors. It is doing a soft pivot, if you ask me, now that GPT4 is not that special anymore.
On the other hand, video is much less relevant to Google than text. But if OpenAI figures out something on the path from this to AGI, that would be a different story.
YouTube? Someone's going to make a TikTok-like quick-feedback thing of purely generated content that learns what you like and tailors the generations to you, and, despite Google owning YouTube, OpenAI looks far closer to that than they do.
YouTube is a video hosting platform; its advantage is in video delivery and ads. Why would video generation software disrupt that business?
Creating realistic video isn't hard even today; you can just do it on your phone and create hours and hours of cat/dog videos. The hard part is finding a story that makes it interesting. Automatic film making, from script to realization, could be possible in the future, but that wouldn't make YouTube's business go away either.
I would much rather pay to generate my own realistic videos based on my prompts than watch other people’s random creations (possibly filled with ads). When generation becomes great the motivation and need to store, retrieve and serve becomes less relevant.
This is all very impressive. I can't help but wonder, though: how is text-to-video going to benefit humanity? That's what OpenAI is supposedly about, right?
We'll get some groundbreaking film content out of this in the hands of a few talented creatives, and a vast ocean of mediocre content from the hands of talentless people who know how to type. What's the benefit to humanity, concretely?
> Sora serves as a foundation for models that can understand and simulate the real world, a capability we believe will be an important milestone for achieving AGI.
For models to interact with real-world objects, they first need to understand those objects. These videos demonstrate just how advanced that awareness is. The goal is not to generate videos. Of course, they could and likely will build products on this capability, but the long-term goal is bigger.
Sure, if that's not just marketing. I haven't seen enough evidence to conclude this will go towards that kind of thing yet, but I'm open to the possibility.
They can probably reverse engineer this to build a multi-modal GPT that is fed video and understands what is going on. That's how you get "smart" robots. Active scene understanding via the video modality + conversational capabilities via the text/audio modality.
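A crude sketch of what that pipeline could look like (not OpenAI's actual method: sampling one frame per second with OpenCV, captioning each frame, and keeping a textual scene log a conversational model could reason over; caption_frame here is a hypothetical placeholder for a vision-language model call):

    import cv2  # pip install opencv-python

    def sample_frames(video_path, every_n=30):
        """Yield roughly one frame per second from a ~30fps video."""
        cap = cv2.VideoCapture(video_path)
        i = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if i % every_n == 0:
                yield frame
            i += 1
        cap.release()

    def caption_frame(frame):
        # Hypothetical stand-in for a multimodal model call.
        return "a person picks up a red cup from the table"

    # A running textual scene log; a text/audio model could reason over it.
    scene_log = [caption_frame(f) for f in sample_frames("robot_camera.mp4")]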
I'm not quite sure what you mean, so I'll ask for clarification. Are you saying this technology can be channeled into fighting disease and death, or that the man-hours and computational resources freed up by this technology can be channeled there?
Yeah, this is a very real issue with a lot of Silicon Valley tech, unfortunately. They're perfecting the art of pretending everything is fine, I feel like.
Biologists, chemists, and researchers can all be automated by a very big LLM that OpenAI eventually creates. Then more cures for diseases and more technological advances can be invented. This technology could soon run entire countries and emulate humanity/society.
I guess he might be generating 50 for each response and posting the best, but that would seem deliberately disingenuous, which hasn't been OpenAI's style.
Even the worst is still orders of magnitude better than anything else.
Countdown to studios licensing this for "unlimited" episodes of your favorite series.
There was the Seinfeld AI parody "Nothing, Forever", but once the models improve enough and are cheap enough to deploy, studios will license their content for real and just run endless seasons.
Or even custom episodes. Imagine if every episode of a TV show was unique to the viewer.
Custom AI commercials would be very interesting. Instead of seeing strangers enjoying the benefits of the product, it shows you. A car commercial would show you driving, etc.
Commercials and TV episodes could have a basic "story arc" and then be completely customized to the viewer.
Think about The Simpsons or something. Imagine that the stories of the episodes were kept, but you could swap in the characters and locations. For instance, if you lived in Nashville, TN, all the Simpsons episodes could be generated with Nashville as the setting instead of Springfield.
Then you could have the AI switch out the characters for people you want. Maybe you want to replace Lisa with an AI Simpsons version of you, Mayor Quimby with Nashville's actual mayor, etc.
> Custom AI commercials would be very interesting. Instead of seeing strangers enjoying the benefits of the product, it shows you. A car commercial would show you driving, etc.
I think it'd kind of defeat the point - I can't imagine a person who'd want their likeness to be used to market to them. It'd be a disaster. Setting swaps are more realistic, though at the point where things get good enough for that to be possible, we may just see completely on-demand, newly generated media instead of modifications of what already exists.
If I saw myself onscreen telling myself to buy a product I've never seen or used, I would not buy that product or use that service. It feels violating to have your image used against your best interests (of not being manipulated into being capitalism's bitch) like that.
That is a hell-scape (to me).
Inserting yourself into shows... that feels different, but my gut tells me advertisers would corrupt that idea quickly. Product placement...
Could you show an example of that pipeline? I'm trying to think of a technology that would get you cancelled for not using it, but I can't come up with anything.
That's entirely orthogonal to the issue it was addressing.
The point is that it doesn't matter how close the two can become (indeed, we're already pretty much there); people will always want to read stuff written by actual people (or at least a thinking being) than something purely generated by a model with no other grounding in reality.
One understated aspect of AI Seinfeld is that it took many steps to differentiate it from the actual Seinfeld and create its own identity, such as the 144p visual filter and the random microwave. Those tweaks added to its charm.
If someone tried to do AI Seinfeld again in 2024, many would criticize it for not being realistic enough, now that the tools to do so are available.
I assume you would still be able to do that, just better? Like pixel art: Super Mario Bros. 3 looks great despite being 36 years old. Contrast this with 3D games for the original PlayStation that have aged poorly.
I'm not sure there would be much demand for purely custom/individualized episodes beyond the novelty and maybe for fun with a group of friends. Most of the reason people watch TV or movies is for the shared experience that you can discuss with others. It could definitely drive down production costs though, hopefully HBO uses it to eventually redo Game of Thrones post season 4
Well there is always your AI girlfriend and AI friend group with the AI generated podcast breaking down the episode. (jk, sort of)
> Most of the reason people watch TV or movies is for the shared experience that you can discuss with others
I wouldn't say that. Most of the reason people watch TV is to kill time.
To be honest, I find my discussions with friends about TV shows on the decline just because everyone is watching their own thing. There are so many shows, and people watch them at their own pace, so most of the discussions go like this: "Hey, have you seen that new Netflix show X?" "No I haven't, maybe I'll check it out." Or "Oh yeah, I saw that a year ago. It's good, but I don't remember the details."
Before streaming, when TV had a set schedule, it was way easier to discuss things, because people were forced to watch programs on a certain day and there was more limited content. This led to "water cooler" conversations about the previous night's show.
I bet if you graphed (discussions had about tv shows) / (hours watched of tv shows) that graph would trend down.
Think about little kids. My niece watches Cocomelon all day long. She doesn't need to discuss it with anybody. She just wants an unlimited stream.
> I wouldn't say that. Most of the reason people watch TV is to kill time.
How annoying to see something amazing and then not be able to find anyone who also experienced it that you can ... what word means "commiserate" but in a positive way?
I'm thinking now about the astronauts who walked on the Moon and had only the few others who'd been there. I think one of the astronauts bemoaned having gone to this amazing place, like some kind of wild vacation, but never being able to return.
You can just talk to your AI companion about it. If you involve another human there's always a chance somebody might be slightly bored or inconvenienced, so we want to avoid that.
Same with music. In the good ol' days, one would meet a friend to listen to cool new music together, share CDs with mp3s, etc.
It's actually really weird. I wanted to buy my niece some CDs for Christmas so she could discover 90s music, but kids don't listen to music from CDs anymore. They don't even have the devices. Should I buy her a Spotify gift card and send her links to Spotify via WhatsApp? It's so strange.
Indeed. That is why in our family we watch broadcast or time-shifted TV and no Netflix. Still, it is hard to find other families like that, so there's little TV stuff to talk about at work during lunch.
That would not work, because that's not how people work. People watch/play media to connect with others. How can you talk about anything with anyone, or have any shared culture, when other people will never see what you see?
Movies, books, games, are a collective culture, not an individualist one. I don't know about you, but when I like an experience, I want to share it with others.
To be blunt about it, I can't help but imagine that the people who make such comments (and I've seen quite a few recently) are just complete philistines. They're the same people who can't draw, write, play, sing, design, or anything else and yet think they know what's good.
It's almost as if they think the purpose of art or entertainment is to stimulate some particular part of the brain and everything else between that and the screen/speakers/canvas/whatever is just an inconvenience that ought to be dispensed with as soon as technology allows.
> Imagine if every episode of a TV show was unique to the viewer.
This is the bit I don't think will happen, at least in big quantities. Half the fun of watching a popular series is being able to discuss it with people afterwards!
Absolutely insane. It's very odd where the glitches happen. Did anyone else notice in the "stylish woman ... Tokyo" clip how her legs skip-hop and then cross at 0:30 in a physically impossible way? Everything else about the clip seems so realistic, yet this is where it trips up?
She's also wearing a different jacket at the end of the video. Continuity is not maintained when the video zooms back out to a wider shot after the close-up on her face. See, e.g., no zipper on end jacket and obvious zipper on jacket earlier in the video, or placement of the silver "buttons" and general structure of the lapels.
The background details are particularly "slippery" in these videos. E.g., in the initial video of walking along a snowy street in Japan, characters on the left just sort of merge into/out of existence. It's impressive locally, but the global structure and ability to paint in finer-grained details in a physically plausible way fails similarly to current image gen models, but more noticeably with the added temporal dimension.
And the cat that wakes up the woman in bed, has three front paws! And that woman seems to be wearing the blanket as though they were pyjamas. Still, it's usually very hard to notice the inconsistencies -- just like the subtle inconsistencies we might see in our dreams.
Yes, there's some really weird hand-blanket morphing going on in that cat shot. Similarly in the guy reading a book on a cloud, the pages flip in a physically impossible way at one point.
I just think it's perplexing how they got things so right, yet so wrong. How did they implement this?!
I'm not sure about others, but I'm extremely unnerved about how OpenAI just throws these innovations out with zero foreshadowing - it's crazy how the world's potentially most life-changing company operates with the secrecy of a black military program.
I really wonder what's going to come out of the company and on what timeline.
This is both amazing and saddening to me. All our cultural legacy is being fed into a monstrous machine that gives no attribution to the original content with which it was fed, and so the creative industry seems to be in great danger.
Creativity being automated while humans are forced to perform menial tasks for minimum wage doesn't seem like a great future and the geriatric political class has absolutely no clue how to manage the situation.
None of the examples you’ve given are even remotely the same thing.
> The artist that painted Mona Lisa didn't credit any of the influences and inspirations that they had.
This is not “influence and inspiration”, this is companies feeding other people’s work into a commercial product which they sell access to. The product would be useless without other people’s work, therefore they should be compensated.
> Just as cameras made many artists redundant, so too will every other new tool, and not just artist but pretty much every job.
The camera enabled something that was not possible before, and it wasn't built by taking the work of sketch artists and painters. It was an entirely new form of art and media.
The only thing this stuff revolutionises is new ways to not pay people. I find the implications deeply depressing.
> This is not “influence and inspiration”, this is companies feeding other people’s work into a commercial product which they sell access to. The product would be useless without other people’s work, therefore they should be compensated.
How else do you get influence and inspiration without feeding other people's work into your own brain? Do you know a single artist, writer, or musician who hasn't seen other artists' paintings, read other writers' books, or listened to other musicians' music? Ingesting content is the core of how influence, inspiration, and learning work.
> The camera enabled something that was not possible before… The only thing this stuff revolutionises is new ways to not pay people.
It's never been possible to generate thoughts, writing, and images so quickly and at such a high level. It's made creative pursuits accessible to billions who previously didn't have the skill or time to do them well, or the money to hire others. As a random example, I have friends using ChatGPT to compose creative and personalized poems and notes about each other. Not something they were doing before.
> The only thing this stuff revolutionises is new ways to not pay people.
The camera lessened the need for people to go to plays and pay for tickets to see things in person. Just as records, CDs, and mp3s lessened the need to go to concerts and shows. Technology is always creating and destroying ways to pay people. The ways that people get paid are not supposed to be fixed and unchanging in time.
> How else do you get influence and inspiration without feeding other people's work into your own brain? Do you know a single artist, writer, or musician who hasn't seen other artists' paintings, read other writers' books, or listened to other musician's music? Ingesting content is the core of how influence, inspiration, and learning work
I am a human, alive and sentient. I can be held responsible if my “inspirations” stray into theft. A machine cannot, and it’s increasingly looking like the companies that operate the machines can’t either.
I also can’t churn out my inspired works at a rate that displaces potentially everyone who has ever influenced me.
> It's made creative pursuits accessible to billions who previously didn't have the skill or time to do them well, or the money to hire others. As a random example, I have friends using ChatGPT to compose creative and personalized poems and notes about each other. Not something they were doing before
How on earth is using a machine to spit out a poem a creative pursuit? There’s no more creativity there than watching a movie someone else made. It’s entertaining, yes, but it’s not creativity.
> The camera lessened the need of people to go to plays and pay for tickets to see things in person. Just like records, CDs, and mp3s lessened the need to go to concerts and shows
This doesn’t hold water. Cinema did not eliminate theatre just as records did not eliminate live music. In fact, both are arguably as big now as they have ever been. The technology here filled a new space, it didn’t threaten to throw everyone out of an existing one.
I can't know if you've actually used these tools, but it requires a pretty high level of creative thinking to get them to produce the content you're looking for. Maybe as a user of an LLM you don't need to be creative in the writing of words, for example, but you instead need to be creative in how you control the tools: pick the right outputs, feed them back, copy/paste/cut them, change stuff, extend it... and the same with the image generators. There's a HUGE amount of creative accessories around them to manipulate and steer the process. There might be less creativity needed with the pen, but it's needed in other ways.
I don’t see the advent of generative art any different than when we moved from paper to photoshop.
For those unaware the vast majority of graphic artists start their projects with assets and base images that they themselves don’t create. With generative ai you’re simply going one step further and have another new tool create a more polished version that you can edit to remove extra fingers, etc. It’s simply moving the baseline from 20% done to 60% done, which will result in artists producing even higher fidelity and more detailed art.
For example an artist could generate a bunch of scenes using Sora and create a collage of them for a larger piece of art, something that is prohibitively time consuming right now.
> I also can’t churn out my inspired works at a rate that displaces potentially everyone who has ever influenced me.
I'm with you, man. I'm still trying to find a lawyer who will sue Kubota and John Deere for moving dirt at a rate far superior to me and a shovel, but nobody will take my case.
> How on earth is using a machine to spit out a poem a creative pursuit?
100%, man. Nobody is mentioning the magical fairy dust in human brains that makes us superior to these models. When I really like fantasy novels, and then train my neurons on thousands of hours of reading Tolkien, Terry Brooks, Brandon Sanderson, etc, and then I get the idea to write my own fantasy series, my creative process doesn't draw on my own model's training data at all. It's 100% "creative", and I would produce exactly the same content if I were illiterate. But these goddamned machines, man. They don't have our special human fairy dust.
When we discovered the universal law of gravitation, and realized that the laws of physics are omnipresent in our universe, we put a giant asterisk to note that the laws of physics are different inside humans. The epidermis is a sort of barrier to physics, and within its confines, magic happens, that these pro-AI people conveniently "forget".
To paraphrase the eminent Human Unique Creative Person Roger Penrose: "There's magical quantum shit goin down in the microtubules. It's gotta be the microtubules. I think, right? I can't prove it, but as a scientist, we don't need proof. Making sure we think we are superior is more important."
> I am a human, alive and sentient. I can be held responsible if my “inspirations” stray into theft. A machine cannot, and it’s increasingly looking like the companies that operate the machines can’t either.
150 years ago, Bertha Benz wasn't allowed to own property or patents in her own right, because the law said so.
The specific reason a machine cannot be held responsible today is because the law says so.
Also, dead humans' copyright is respected in law, so "alive" isn't adding value to your argument here.
> I also can’t churn out my inspired works at a rate that displaces potentially everyone who has ever influenced me.
I can't run faster than every athlete who has ever inspired me, this argument does not prevent motor cars.
I can't write notes faster than the world record holder in shorthand, this argument does not prevent the printing press.
I can't play chess or go at even a mediocre level, this argument does not prevent Stockfish or Alpha Go.
I can't hear the tonal differences in Chinese well enough to distinguish "hello" from "mud trench", this argument does not prevent Google Translate from learning the difference between "你好" and "泥壕".
I can't do arithmetic in my head faster than literally all other humans combined even if they hadn't been trained to the level of the current world record holder, this argument does not prevent the original model of the Raspberry Pi Zero.
"The machine is 'better', in one or more senses of the word, than a human" is, in fact, a reason to use the machine. It's the reason to use a machine. It's why the machine is an economic threat — but you can't just use "my income is threatened by this machine" as a reason to prevent other people using the machine, just as I as a software developer can't use that argument to stop other people using LLMs to write code without hiring me.
> Cinema did not eliminate theatre just as records did not eliminate live music. In fact, both are arguably as big now as they have ever been.
You can argue that, but you'd be wrong.
Shakespeare wrote for normal everyday people; his stuff fit into the category that today would be "TV soap opera", where the audience was everyone rather than just the well-off, where the only other public entertainment options were bear-baiting and public executions, where the actors had very little time to rehearse, and where "you're ripping off my ideas" was handled by rapidly churning out new content.
Live music, without amplification, used to be the only way to listen to music. Now, even if you see a live performance, you can have 10k people in a single venue listening to a single band… and if you want music in a pub or a dance club, the most likely performance is from a DJ rather than a band, and the "D" stands for "disk" because the actual content is pre-recorded — and that's not to say I would deny that DJ work is "creative", but rather that it makes DJing exactly what critics accuse GenAI of being, remixing of other people's work.
Which, now I think about it, is a description that would also apply to all the modern performances of Shakespeare: simply reusing someone else's creation without paying any compensation to the estate.
But I know that will rub you the wrong way. I know that art is the peacock's tail of humans: the struggle, the difficulty, is the point, and it has to be, because that's how we find people to start families with. Because of that, GenAI is like being caught wearing a fake Rolex watch, and you can't actually defend that with logical reasons such as "real Rolex watches aren't very good at keeping time compared to even a Casio F-91W, let alone the atomic clock synchronising with my phone", because logic isn't the point, and never was the point.
Reading your opinion on the subject, I believe you’re struggling to make sense of what is happening. I suspect there is a combination of factors here: you are reinforcing a bias, can’t wrap your head around it, don’t have much experience working with AI, haven’t deeply considered the evolution of the universe.
My recommendation: zoom out a little bit. Every step in history is so brief and nothing is normal for long. Even humanity is a blink.
Comments like: “how is using a machine to spit out a poem creative”. Really? How is using a digital camera creative compared to painting. How is a painting creative compared to etching? And on and on evolution goes..
> Reading your opinion on the subject, I believe you’re struggling to make sense of what is happening. I suspect there is a combination of factors here: you are reinforcing a bias, can’t wrap your head around it, don’t have much experience working with AI, haven’t deeply considered the evolution of the universe.
I agree. I could have expressed my thoughts better in this case. It wasn’t just OP I was considering. I was thinking of a common AI take that I’ve seen when I wrote my comment. Regardless, will do better to express my thoughts and agree that we shouldn’t profile each other here.
- The rate of change that AI forces upon us has never before been experienced.
- The scale of these changes is nothing like we've ever seen before.
The adoptions of the camera, radio, automobile, TV, etc., didn't happen practically overnight. Society had a good decade+ to prepare for them.
Similarly, AI doesn't just change one industry. It fundamentally changes _all_ industries, and brings up some fundamental questions about the meaning of intelligence and our place in the universe.
My fear is that we're not prepared for either of these things. We're not even certain how exactly this will affect us, or where this is actually all taking us, but somehow a very small group of people is inevitably forcing this on all of us.
Because of this I think that being conservative, and maybe putting some strict regulation on these advancements, might not be such a bad idea.
I agree with what you are saying as well. But AI is not displacing jobs at the rate the technology is advancing. True, we hear anecdotes on HN about people losing their jobs; that was also happening during those earlier adoptions, but we didn't know about it in real time.
Humans still need to adapt, and we are slow. If the singularity is near [it isn't], we can be afraid; until then, we are the limiting factor here. Displacement will happen, but growth will happen faster with these new tools.
Because as I grow older, I find I am less and less equipped to keep up with the rate of change that we are undergoing. It also means a lot of uncertainty for the immediate future. If AI takes over my job, will I still be able to compete in some industry somewhere and provide for myself?
I don't want much out of life, but I do want the ability to influence my own personal situation. If we wind up in the UBI-ified, dense urban housing future where AI does all the work and no one owns anything, how much real influence will I have over my life?
Will I live out my days in a government issued single bedroom apartment, with a monthly "congratulations for being human" allowance from the government? I don't want that. People say it will free us up to pursue whatever we want, but to me it sounds like the worst cage imaginable. All the free time, and no real freedom to enjoy it with.
Because make no mistake. If you live on handouts from your government, you aren't free.
So with that as a potential, maybe even likely outcome, why aren't you afraid of change?
I think the question is more along the lines of "will your government continue to pay your social security if you don't remain living in the country", not "can you deposit it somewhere else"
Also, what if you get into trouble? If you're arrested for a crime (even if eventually found not guilty), will you continue to receive social security?
Are there any circumstances under which your government could refuse to continue paying it?
And most importantly: could your government invent such a circumstance in the future, and then invoke the new circumstance to deny you the payment?
Living on government money reminds me of my cat. She relies on me to feed her and provide for her, and I do happily take good care of her because I love her very much.
1. My government will continue to pay my Social Security if I don't remain living in the country. My father emigrated from the U.S. to Israel after he retired and he continued to receive his Social Security for about 20 years, until the day he died.
2. "Also, how about if you get into trouble. If you're arrested for a crime (even if eventually found not guilty), will you continue to receive social security?"
"If you receive Social Security, we'll suspend your benefits if you're convicted of a criminal offense and sentenced to jail or prison for more than 30 continuous days. We can reinstate your benefits starting with the month following the month of your release." — Social Security Administration
3. "Is there any circumstances where your government could refuse to continue paying it?"
If it goes broke, certainly.
4. "And most importantly: could your government invent such a circumstance in the future, and then invoke the new circumstance to deny you the payment?"
> Because as I grow older, I find I am less and less equipped to keep up with the rate of change that we are undergoing. It also means a lot of uncertainty for the immediate future. If AI takes over my job, will I still be able to compete in some industry somewhere and provide for myself?
I understand this fear, and sympathise with it even though I have multiple income streams.
> I don't want much out of life, but I do want the ability to influence my own personal situation. If we wind up in the UBI-ified, dense urban housing future where AI does all the work and no one owns anything, how much real influence will I have over my life?
Why do you fear "dense" urban housing future? I think most people choose relatively dense environments because that's where all the stuff they want is, but rural areas are cheaper[0], and the kind of future where humans must live on UBI due to lack of economic opportunity is necessarily one where robots do the manual labor such as house building and civil engineering, not just the intellectual jobs like architecture and practicing real estate law.
Likewise, while I can see several possible futures where nobody owns stuff, the tech to make it happen is necessarily also good enough that any random philanthropist who owns just one tiny autofac would find it trivial to give everyone their own personal autofac — "my first wish is infinite wishes" except the magic genie doesn't say "no".
[0] The only reason I'm looking to move somewhere a bit more rural is that the sound insulation in my current place is failing, and I'm right by a busy junction with multiple emergency vehicles passing each day — and the less built-up areas are the cheap ones. Still the biggest city in Europe, but I'll be surrounded by forest and lakes on most sides within 15 minutes' walk.
Because I hated living in apartments when I lived in them. They are noisy and small, and I like quiet and space. For me, being close enough to walk to stuff is not appealing enough to make up for how awful the experience of living in dense housing is.
I strongly think that dense housing is only positive for people who don't spend much time at home.
> "my first wish is infinite wishes" except the magic gene doesn't say "no"
The problem with this is that we haven't actually solved resource scarcity, and until we do there is still going to be an upper limit to what you will be allowed to buy, controlled by the number printed on your UBI cheque. I am anticipating this number to be much lower than what I currently am capable of achieving in my career.
Of course, this is the fear: that my career won't exist in the future. Or simply that AI will eat enough jobs that I will be edged out by better human competition. I'm under no illusions that I'm near the top of my field; I am firmly in the middle of the pack at best.
> sound insulation in my current place is failing
The sound insulation in the apartments I've lived in was nonexistent. This is a big part of why I never want to do that again.
> Because I hated living in Apartments when I lived in them.
I meant more along the lines of: why do you expect that to be the future, such that you have reason to fear it?
> The problem with this is that we haven't actually solved resource scarcity, and until we do there is still going to be an upper limit to what you will be allowed to buy
Yes, but the AI necessary to make human labour redundant is that tech. In the absence of that tech, humans could still get jobs doing whatever the stuff is that AI can't do.
> why do you expect that to be the future, such that you have reason to fear it?
Because if I don't have an income I don't expect to be able to afford anything bigger.
> In the absence of that tech, humans could still get jobs doing whatever the stuff is that AI can't do
Which will be manual tasks that I am aging out of being able to keep up with, or.. what? Stuff that traditionally doesn't pay as well as knowledge work, right? And may not pay much more than the UBI anyways?
> Because if I don't have an income I don't expect to be able to afford anything bigger.
A big rural place is cheaper than a tiny city place.
> Which will be manual tasks that I am aging out of being able to keep up with, or.. what?
Automation started with the manual stuff, well before computers were invented. Even for humanoid robots, their hardware is better than our bodies, and it's the software which keeps it from replacing specific workers, though telepresence is one way around that.
> I don't want much out of life, but I do want the ability to influence my own personal situation.
We are still animals in the animal kingdom. It’s survival of the fittest as long as resources are not infinite. You can never expect this luxury. You are predator or prey.
>Because make no mistake. If you live on handouts from your government, you aren't free.
This isn't actually the problem since we need and will continue to need UBI for non-AI related reasons
>People say it will free us up to pursue whatever we want, but to me it sounds like the worst cage imaginable.
This is where you're missing that "pursue whatever we want" will also be limited by AI, and the secondary effect of people growing up consuming and enjoying AI productions tailored to their interests. At best, you'll have a few people with some skill commanding Patreons, but generally you'd have to find a domain to pursue that isn't already automated. Luddite subcultures will have to develop. But generally, you yourself and most others, particularly the children of millennials who'll grow up with this stuff progressing in sophistication, might just spend your time watching your video prompts come alive; and who would wanna do anything else when you can get straight to what you wanna see?
> we need and will continue to need UBI for non-AI related reasons
This mentality is why bitcoin is going to cruise through 1 million dollars a bitcoin and on and on. Print Monopoly money and people who earn will keep seeking out sound money.
Hint: the money comes from redistribution, not blindly printing more, the latter would obviously be completely insane (which is why you'd rather argue that scenario) whereas the former would keep the economy going, which is obviously in the interest of the capitalist class. No point owning and producing if there's no buyer because everyone is starving.
What you seem to think would devalue money will be the very thing that keeps it going as a concept.
And I hope you understand somewhere deep down that Bitcoin is the epitome of monopoly money.
I see it as the polar opposite, backed by math. A politically controlled money supply with no immutable math-based proof of its release schedule is Monopoly money. Cuck bucks. Look at the 100 year buying power chart.
On your second point, in spirit I agree. You need a stable society to enjoy wealth, so it's in the ruling class's best interest to keep things under control. HOW to keep things under control is the real debate.
That's what makes it bad. A fixed algorithm that soon will spawn pittances would do an utterly miserable job if it ever gained status and usage as actual currency. Deflation is bad. So much worse than inflation. Not having flexibility in the money supply is lunacy.
Mild inflation resulting in 100 year buying power going to fuck-all is good. It forces money to be invested, put to work. If sitting on your stash is its own investment the economy is screwed. Reduced circulation means less business means less value added and generally more friction. Why would you want that?
Crypto does some things well (illegal stuff, escaping currency controls/moving lots of money "with you") but in the end that also requires it is only just big enough for reasonable liquidity, but not so big it has an impact on the actual economy. For what it's being pushed for... it's a negative-sum game only good for taking people for a ride. It should stay in its goddamn lane.
All money is politically controlled, including Bitcoin (although it's debatable if Bitcoin even counts as money). The politics of Bitcoin are one-op-one-vote rather than one-man-one-vote, but it's still there, and it's still mutable if enough of them cast their votes in any given way.
Da Vinci also made money from the painting, and the Louvre continues to do so right now. He didn't credit his influences and inspirations. This is not sad.
The camera did enable painters to pretend they had spent hours at a scene they painted, when instead they painted from other people's photographs. Artists are not angels; they do the same "bad" things as OpenAI.
In what way does anyone have a monopoly on generated images and video? Last I checked there were several major players and more startups than you can shake a stick at.
It won’t last. There’s a massive incentive to build more GPUs and develop specialized chips and everyone who can is scrambling to meet that demand. The technology is not some trade secret that no one can copy which is why there are so many people and companies diving into this market now. Hardware is a bit slow to ramp up production of but it will get there eventually because there’s money to be made.
Does that matter when the models they generate are given away for free?
You can make your argument validly against DALL•E or Midjourney families, but we've also got the Stable Diffusion family of models that anyone can just grab a copy of.
I'm talking about generative AI vs. human artists. But in this case it seems like OpenAI specifically has a massive leap over everyone else with this video generation, so whether they have a monopoly over that remains to be seen.
What does not remain to be seen, though, is that generative AI is going to put a lot of artists out of work.
You can argue about the good and bad of that, but it's defo happening.
So at what point is a painter too effective to be legal? Should we limit the amount of paintings that a single painter is allowed to produce per month?
Not sure if you’re just being facetious but my point is that individual painters do not need to have limits on them because they have a natural human limit that stops them causing societal problems.
What if da Vinci had been superhuman and could take on 1,000,000 commissions per day and had also taught himself every style of art and would do each commission for 0.001x the cost of anyone else.
Yes, society as a whole would benefit from a fantastic amount of super high quality art.
But the other artists are not gonna be so happy with the situation, are they?
People make decisions based on what society deems valuable. That changes over time and has for the entirety of human history.
Maybe there’s a demand for more customized art. Maybe spite patronage will make a comeback.
Anyone telling you they know how it will shake out is a fraud. But the incentives we’ve set up have a natural push and pull to get people to do what society values.
It's funny how all you guys are arguing that there isn't a right/law to make money from art. What do you think copyright is? The issue is that all these models were trained in blatant violation of copyright. And before you say they just take inspiration: that's the same argument as saying that when I copy a movie to my hard drive, it's the same as remembering it. It's not, and a computer is not a human.
Da Vinci inspired whole new generations of artists, thinkers and scientists. The net benefit of his existence distributed itself among many others - as it does with any great artist, thinker or scientist. It certainly looks like generative AI has at least in some cases the opposite effect.
> into a commercial product which they sell access to
Within a few months or years there will be open source implementations anyway, running locally or in a data center. Most of the technology is published.
Contrary to text, where big piles of "liberated" data are hanging around for anyone looking hard enough to grab, the training data for video seems harder to access for open source / research / individuals. Google has YouTube; OpenAI can pay whatever fee any proprietary data bank requires. There's a moat right there that I can't see how to overcome.
Weird to say, I guess, but Meta might release an open source model too. And they do have plenty of data to feed their models; arguably more data than OpenAI, which doesn't really own any social media.
Thing is, anyway, as soon as one model is open there will be copies of it and fine-tuned implementations. People don't care that much about the ownership of data, I would say, if they actually have access to the models that are produced by gathering this data.
Ultimately, to me, an open source model for this tool makes a lot of sense. They use publicly available data and the models become publicly available.
I for one am quite excited for this tooling to become better and better so I can make the adaptation of a book I love into a movie I imagine it can be. At least I can have a lot of fun trying.
> This is not “influence and inspiration”, this is companies feeding other people’s work into a commercial product which they sell access to. The product would be useless without other people’s work, therefore they should be compensated.
Sure.
Who do we send the compensation to for Leonardo da Vinci? Or Shakespeare, for a text-based example?
Do you want them to compensate me for the stuff I uploaded to Wikipedia and licensed as public domain, or what I've uploaded to GitHub with an MIT license?
A model trained only on licensed data is still an existential threat to the incomes of people whose works were never included in the model, precisely because they're only useful to the extent that they generalise beyond their own examples.
> The camera enabled something that was not possible before, and it wasn't built by taking the work of sketch artists and painters. It was an entirely new form of art and media.
A new form of art that was (a) initially decried as "not art", and (b) which almost completely ended the economic value of portraiture.
> Who do we send the compensation to for Leonardo da Vinci? Or Shakespeare, for a text-based example?
Those authors aren't alive and their works are in the public domain. Bringing them up is irrelevant and a diversion from the actual problem, which is that creators alive today whose work is under copyright today and who need to make a living from their art are having it taken with zero compensation and had it fed into AI, stealing their effort.
> A model trained only on licensed data is still an existential threat to the incomes of people whose works were never included in the model, precisely because they're only useful to the extent that they generalise beyond their own examples.
Again, a diversion. We can debate how much AI trained on properly-licensed data should be controlled, but it's pretty clear that the bare minimum is for all AI training data to require explicit permission from the creator of that data.
Let me clarify - you're not even misinterpreting my comment - you're just making up random things that I never said and which no reasonable person could ever draw from my words.
There's no point to arguing this further because you're clearly not acting in good faith. It is impossible to have a reasonable conversation with someone who randomly (and falsely) claims that others said things that they did not.
Those are not fundamentally different. A group of people coming together to create a company that trains an AI model for profit, and an artist studying thousands of pieces to develop a style of their own and then selling paintings based on that style, are both totally dependent on the body of knowledge that civilization left for them.
Artists do credit their teachers (Verrocchio in the case of da Vinci), schools, sources of inspiration and influences, so I'm confused by this comment.
What kind of acknowledgement did you have in mind?
If the producers of these models weren't incentivized to hide their training data, it would be almost trivial to at least retrieve the images most similar to the content produced.
Some images will be maximally distant from training examples, but Midjourney repainting frames from "Harry Potter" could very easily automatically send a check to J.K. Rowling per generation.
These AI startups are just trying to have a free lunch in a very mature industry.
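To make the retrieval half of that concrete: here's a minimal sketch of what "retrieve the most similar training images" could look like, assuming CLIP-style embeddings (via open_clip) and a faiss index. The model name, corpus paths, and generated-frame path are all made-up placeholders, and cosine similarity in embedding space is only a rough proxy for visual similarity, not proof of copying:

    import torch
    import faiss
    import open_clip
    from PIL import Image

    # Load an image-embedding model (the exact model choice is illustrative).
    model, _, preprocess = open_clip.create_model_and_transforms(
        "ViT-B-32", pretrained="laion2b_s34b_b79k"
    )
    model.eval()

    def embed(paths):
        # L2-normalize the embeddings so inner product == cosine similarity.
        batch = torch.stack(
            [preprocess(Image.open(p).convert("RGB")) for p in paths]
        )
        with torch.no_grad():
            feats = model.encode_image(batch)
        feats = feats / feats.norm(dim=-1, keepdim=True)
        return feats.cpu().numpy().astype("float32")

    # Index the (hypothetical) training corpus once, up front.
    train_paths = ["corpus/img_000.png", "corpus/img_001.png"]  # placeholders
    index = faiss.IndexFlatIP(512)  # 512 = ViT-B/32 embedding width
    index.add(embed(train_paths))

    # For each generated image, look up its nearest training neighbors.
    topk = min(5, index.ntotal)
    scores, ids = index.search(embed(["generated/frame.png"]), topk)
    for score, i in zip(scores[0], ids[0]):
        print(f"{train_paths[i]}  cosine similarity {score:.3f}")

For video you'd presumably do the same over sampled frames. The hard part, as the comment says, isn't the lookup; it's that outsiders don't have the training corpus to index in the first place.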
"The world doesn't work that way". Quite pessimistic a position to hold here, no? We–in technology especially–are in positions of significant leverage. We should be talking about how we can limit the negatives and bolster the positives from these generative models. The world can work in a different way if we put enough energy into it. We don't have to stand by as subjects of inertia. That is why OpenAI and others are treading carefully, trying to trigger some kind of momentum of reflection instead of letting our base demons run amok.
That's a massively charitable reading of their actions. Whenever I see a "thought leader" behind these companies talk about how careful they're being, I just see marketing: someone desperately trying to impress upon everyone how revolutionary their model, and by extension they, are. It's kind of sad...
I definitely see it as self-serving too, yes, but I also see it as a convenient temporary alignment of incentives. The world and its regulators definitely need time to adjust and educate themselves, so I'm glad for now that they're exercising restraint.
> The way these models are creative is the same way humans are
We have no idea how human creativity works, but we know with certainty that it doesn't involve a Python program sucking in pixel data and outputting statistical likelihoods.
You know, I've seen people do amazing things with math equations. Beautiful visualisations.
As these tools improve and it becomes more possible for us to actually turn our ideas into images and videos that fit the "yes, this is what I want" bill, we are going to see amazing things come out.
I mean, a few days ago I saw this clearly AI generated video of some wizards snowboarding and having a blast in the mountains. It's one of the funniest things I've seen in a while, simply so ridiculous. Obviously someone had the idea "I want to make a video of wizards snowboarding on a mountain"; that's where the creativity lies.
So to say "creativity doesn't involve a python program outputting statistical likelihoods" is, imo, just you saying you're not creative enough to know what to do with the tools you've been given.
Some people, when they see a strawberry, see a fruit. Others see endless dishes where the fruit is just an ingredient.
That's a meaningless statement. Any interacting physical system is an "input output" pattern, as long as you're only looking at the inputs and outputs. Behaviorism fell out of favor for a reason. It's what's transforming inputs and creating outputs that matters. For that matter, you'd need to be able to define what an input and an output even are for humans, given that we have bodies.
I don't really want to weave baskets, that's what I'd want a machine to do.
"The world doesn't work that way" - I've seen this so often, but the most incredible thing about humans was the optimism to be able to change how the world worked -- that's the main impetus of most revolutions.
Personifying computer programs is also an error; it's like saying that bombs kill people when there has to be a person dropping them (at least until we get Skynet).
>I don't really want to weave baskets, that's what I'd want a machine to do.
In my free time I like to code games. I don't have the money to pay an artist, nor the time/will to learn how to draw; that's what I'd want a machine to do.
I do agree with you that personifying computer programs is an error. That's also why I avoid calling these AI, because they're FAR from that. But I do believe there will come a day when personifying a computer program will be a real question.
>The way these models are creative is the same way humans are. The artist that painted Mona Lisa didn't credit any of the influences and inspirations that they had.
I'm continually amazed at how many people argue against this point on HN, which is largely biased toward logical discourse. What you just said is exactly right, and it is the Achilles' heel of the legal arguments against generative AI. If what they are doing is illegal, then so is the human act of creativity. If human creativity is legal, then so is generative AI trained on existing art.
What has yet to come is the mass realization (or perhaps, admission) that the way AI works is no different from the way we work.
my name is timothy basket -- you're saying people have stolen my weave?!
end sarcasm. but seriously -- claiming you made something you didn't isn't ok. but it happens, regardless of laws or regulations or norms.
i don't have any solutions; the internet helps because you can publish something and point to it. i'm a musician, and sometimes i only realize well after the fact how influenced i was by something for a song i've written.
It is absolutely not the same, and saying so disregards centuries of knowledge stratification.
These machines produce superficial artifacts that lack any layering of meaning or semantic capital (see Luciano Floridi).
They are the byproduct of the engineering extremism and lack of humanities knowledge of the people getting rich through their creation.
Models learn exactly like artists, and also, for some reason, the people who use those models are artists making art. Wait… Artists learn by passively ingesting many millions of pieces of media someone feeds them for the non-specific purpose of "generating art", so some person who wants to take credit for making the end piece can tell them exactly what to make, right?
This reads like a wildly confident statement about art.
While at the same time not mentioning the actual name of "the artist that painted Mona Lisa" (Leonardo da Vinci), nor knowing that the name of his master is very well known, and that even the influence of artists he seemed to despise (e.g. Michelangelo) is very well documented as well.
Maaybe this narrow view of (art) history needs to be fine-tuned on more data :-)
The human world works the way humans make it work. Pretty much what Jodie Foster's character in the movie Contact told that asshole trying to steal all the credit from her and take her place in the mission to go visit alien dad in Pensacola.
“whose existence and names we will never know or acknowledge.”
That’s the problem. We know their names. We know their stories, their contributions. Babbage. Lovelace. Ritchie. Spielberg. Picasso. Rembrandt. This is what giving attribution is all about. So we don’t just stand there asking how we got here.
To the influences that they know. Our brain isn’t an attribution machine. When a musician recreates a chord progression that they’ve heard before without noticing it, is that theft?
If a comedian accidentally retells a joke, is that theft?
Your argument is similar to the classic hand vs. power tools argument in crafting, which eventually boils down to "did you mine the ore and forge the tools yourself?" Nowadays the argument is about CNC vs. hand crafting.
This is just a point in our overall evolution. It's an exciting time. We are here to learn and adapt.
Humans can still be creative all they want. There's still the stamp of "created by a human" that will never go away. You can choose to respect it or ignore it.
True, but while the 'best' chess is played by computers, few people care to watch Stockfish playing with itself. Meanwhile the human-powered chess world is enjoying a surge in popularity.
> centaurs (human+AI) in chess/go were better than either humans or AI just for a short time.
I was having a conversation about this with a friend last weekend, and we'd assumed that centaurs were still better than either top humans or top computers. I'm unable to easily find this info on Google; could you share where you saw that centaurs are no longer better than top computers?
I saw it here, perhaps, in articles about competitions where both humans and computers were allowed (computer-only teams won). I too can't find anything relevant on Google.
I see a shallow analogy that isn't true to me on close inspection.
To me, human activities from which we can earn a living wage feel like nomadism at the edge of an ever-expanding region of agriculture (technological automation, in this case). When we lose some activities to automation, we've always found new ones, until now. In the end, though, there were no more pastures for the nomads to move to, and there will be no more new activities from which humans can earn a wage (not to mention the satisfaction of accomplishing something hard). And while there might be a future with UBI for everyone, the transition seems rough and exploitative.
Most labor will be automated within the next few decades. It'll be a post-labor world with one less factor of production. Capital and land ownership are all that will matter, assuming we don't completely redesign our economic and political system. Pretty scary.
My one hope is that the price of goods becomes so low due to AGI/automation that the uselessness of labor in the economy won't matter. People can still be materially prosperous even with a meagre UBI (and it will be meagre, because people have no political power in a post-labor society where the only thing that matters is capital).
>Capital and land ownership is all that will matter assuming we don't completely redesign our economic and political system. Pretty scary.
Agreed. My concern isn't really about any of the accomplishments of generative AI. Frankly, in my daily life I'd welcome readily available access. As it stands now, it's sort of a mixture of analytics and creativity without consciousness as we best understand it, so GPT itself isn't going to murder me and take over the world.
The real issue is who owns these things, how you access them, how the effects will ripple through a labor-based economy, and how we'll adapt (or not) our current economic system. As it stands, we've been catering to the capital ownership group for a while now. If that doesn't change direction, then I fear the implications of much of this for daily life. There's still a fair bit of specialization and domain knowledge needed to leverage these tools, i.e., to understand the questions to ask (both the prompts for LLMs and the context of information fed to them), but they can certainly in many cases behave as multipliers that reduce the amount of staff needed in some creative roles, or eliminate some altogether.
This isn't a new dilemma, as technology has arguably been shifting the labor market for centuries. The question is how, and whether it can reshape well this time, or if we need to fundamentally rethink these concepts of labor and capital ownership. That's my major concern.
> It's the opposite. Price of goods is becoming more and more expensive due to larger demand and lower salaries.
We're discussing a hypothetical post-labor future in 5-40 years. We probably shouldn't predict the economic theory of this future by looking at recent trends. Recent trends are driven by business-as-usual things like supply chain disruptions. And we're still near full employment, so we're not on the gradient toward post-labor just yet. Post-labor economics (and politics) will probably be radically different; all the economic assumptions we take for granted go out the window.
Honestly, I don't think the unemployment rate will change much. Humans are great at inventing things to do, and if other people see those things as valuable they will pay for them. I do think the world will look very different, maybe even unrecognizable, but it's not going to be full of people doing nothing.
It's too early to close the bets. Art (I mean, drawn porn) was just the easiest thing to kickstart from all the tech that the invention of modern ML and GPUs will enable.
It doesn't look like the opposite; it looks like it automated even what none of us could have thought of, and did that first.
I disagree. I think this is going to empower creatives like never before. Filmmaking currently takes a huge amount of time and money. Countless would-be filmmakers are relegated to making 30 second tiktoks because that's all they can afford to do. This could change all that.
Exactly this. Art changes over time. The mediums that we use to express ourselves creatively evolve. The position that AI is the end of creative art doesn't take this evolution into account.
When video became an affordable medium, did people say "this is the end of art; live performances are art. Now people will just watch the same recordings over and over"? Maybe, if the internet had existed. But it has had the effect of creating and introducing new art forms.
AI generated content won't replace art. It will evolve it into something new and creative.
Today, only a highly privileged slice of the population can make a living making art. Nearly everyone who enjoys making art can't make a living off of it, and even the vast majority of people trying to do it full time still can't make ends meet (hence the cliche of the starving artist). But everyone can make art as a hobby if they'd like, that's what almost all artists do, and that will continue to be true as AI advances.
So I don't see AI art as changing careers much. Even if AI fully replaces human artists, all that means is the 0.1% of people who make a career off their art will have to join the rest of us 99.9% who only do art for the fun of it.
You sound like "making art" means only the painter in his Brooklyn studio. But it's also video game designers, movie animators, videographers, graphic artists, and more, working in agencies and in the marketing departments of all sorts of companies, who will be affected.
Those are mostly not well paid roles[0], and there are clearly many hobbyists in these areas also — looking at YouTube, all output is necessarily videography or animation, but what's the income distribution? I have a channel, no money from it (not that this was ever the point).
[0] Unless you're doing furry art, but that's only because furries are "suspiciously wealthy".
> Today, only a highly privileged slice of the population can make a living making art.
I think this is less true now than it's been in centuries, or perhaps in all of history. Artistry is widespread, anyone can do it, and many choose to pursue it even though the pay isn't going to be great; in preindustrial times, even access to the means of creating art was quite limited, as were the media types that existed.
Haven't we always attributed creations to people to motivate our own egos to pursue higher achievements in the name of "glory"? With visions of wealth attributed to fame? Forgive me for being cynical here, but this is how I've always viewed the world. Names are... just that, names. Things we use to communicate ideas/phenomena, but irrelevant in the scope of endless evolution. They could function just as well with some other "identifier" attached.
I have come to terms with the fact that I'm just a speck of sand, just as irrelevant to my own creations as I am to the cells and bacteria that make me up.
I suspect chasing glory is the main driver yes, but we also like to understand how things came to be, and by knowing who made them and when and where we can do that. AI is ushering in a dark age of attribution where we may no longer be able to know how anything came to be after it's spit out of a computer. (I mean dark age as in "it's dark and we can't see," like the Greek Dark Ages or Dark Matter, not in the sense of "times are bad".)
As said every time this "why are we automating creativity when menial jobs exist?" response comes up:
1) Errors in art programs messing up is less worrisome than a physical robot. One going wrong makes extra fingers in a picture, the other potentially maims or kills you.
2) Moravec's Paradox: reasoning requires relatively little computation compared with sensorimotor control and perception.
3) Despite 1 and 2, we are constantly automating menial jobs!
Classifying image generation and manipulation tools as "art programs" is the most charitable possible reading of them. When you use them to generate disinformation, incitement, and propaganda, they are potentially maiming and killing humans. This failure mode is well known, the mitigations ineffective, yet here we are, about to take another leap forward after a performative period of "red teaming" where some mitigation work happens but the harsher criticism is brushed off as paranoiac.
I couldn't disagree more strongly that disinformation, incitement, or propaganda maim and kill people. People kill other people. Don't give killers an avenue to abdicate responsibility for their actions. Propaganda doesn't cause anyone to do anything. It may convince them, but those are entirely separate things with a clear, bright line between them. Best not mix them up.
It might be instructive to consider, for example, the history of genocide, in particular of civilian collaboration in state-led genocide. It might be instructive to consider why the genocide convention criminalizes not only acts of genocide, but also incitement of genocide. Why it criminalizes not only the failure to prevent genocide, but also the failure to prevent incitement of genocide. The US has an extraordinarily strong position on freedom of speech; it is nowhere near a universal moral value.
People kill other people is a statement so simple as to be devoid of any positive meaning. What are you actually trying to say? Don't justice systems almost universally contain notions of incitement of crime, criminal negligence to prevent a crime, and other accessory considerations to the actual act?
Don't justice systems almost universally have several levels of responsibility in relation to intent, which at its most basic level can be established by predictable outcomes?
Suppose, for example, that you are a leader of armed forces and also a leader of organizations capable of creating propaganda. Let's say you create and distribute some propaganda (maybe using some AI tools), and a predictable outcome is that soldiers become more lenient in their consideration of the rules of engagement and international law. In that case, one could at the very least establish that you were negligent in your creation and distribution of propaganda. The actual crime would have been people killing people, namely your soldiers, but you would certainly bear some responsibility for it.
You can similarly take a small next step after that and consider that a company producing, distributing, and profiting from a dual-use technology capable of creating propaganda and disinformation that plays a part in crimes could be held at the very least morally accountable for those crimes, if not criminally.
Responsibility, accountability, moral and criminal, are not black and white notions. They are heaviest and easiest to attribute around physical acts of damage, but they stretch far and wide. To think otherwise is to allow the people with the most power to rampage unaccounted.
> freedom of speech .. is nowhere near a universal moral value
It depends on the basis from which you derive your (universal) moral values. Maximalist liberty as a universal moral value can be derived from the dual axioms of universal moral equality and the lack of a moral oracle. If you accept these axioms, it follows that there is no source of moral authority that can legitimately constrain the non-infringing actions of another (e.g., your right to wave your fists around ends where my nose begins). These ideas were first laid out in the Declaration of Independence and expanded on in the Declaration of the Rights of Man.
> What are you actually trying to say?
That the causal chain of an action is completely interrupted at the first agent/actor in the system, who bears full responsibility for their actions.
> justice systems almost universally
It very much depends on the justice system. If you look at US/British/Roman law, a guilty mind (mens rea) and a guilty act (actus reus) are core facts that must be established in order to prove a crime has been committed. These still apply in cases of, e.g., criminal negligence, where a reasonable person ought to have known that their actions would result in harm. Mens rea is quite challenging to prove in cases of incitement, and legal precedents vary.
In combination with the above causal thesis, I hold that restricting incitement is in all cases an overstep of federal authority and an infringement of fundamental liberty. Incitement as a crime seems to have been established to make policing easier, not because telling someone to do something makes you responsible for their actions.
> you were negligent in your creation and distribution of propaganda
People are not inanimate objects. They are decision-making agents. The world is not a Rube Goldberg machine. The soldiers who do the killing are responsible for their own moral attitude and their own actions. You cannot reasonably be expected to know how your ideas will impact the minds of others, since every mind is a black box. Everything that contradicts this does so with generalizations too broad to be predictively useful.
> You can similarly take a small next step
This is where everything goes insane. Where does the responsibility end? You're trying to piece the butterfly effect back together.
Are people who make and sell bullets responsible for shootings? What about those that refine brass and lead? What about those that mine for ore? Creating economic demand, or promoting an idea, are morally neutral actions. People buying goods are in no way responsible for the conditions of their manufacture. People promoting ideas are in no way responsible for the actions a listener may take. Responsibility is zero-sum. Don't allow slavers and murderers to dispense with even a tiny portion of the sum responsibility for their actions. They must bear it all.
This reads a little hysterical to me. It's just a new medium of expression. Who knows, maybe the lack of genuine artistic merit, if there is such a thing, would lead more people to watch Jim Jarmusch flicks.
I watched that many years ago, but I still see a difference here. Everything was a remix made by humans who put their unique selves into the remixing process.
An AI model has no "unique self" to add to creation, at least not as we've understood so far.
I have the impression you think that it's OK for humans to learn upon other people's work and then create their own, but it's not OK for machines to do that. Am I right?
I don't think this position will lead to good outcomes in terms of progress for civilization.
I'm not ideological about this, I wish for a future with self-driving cars for instance.
The current situation is simply too rapidly evolving and can cause significant economic destruction, for instance if many middle-class jobs are lost without anything to replace them.
Change is inevitable, but reckless speed is not, that's a choice we make as a group.
Think of your favorite musicians. How many of them give attribution to where each musical idea came from?
The concept of art as exclusive property is very new. Throughout history, artists have built on one another's works with no attribution or provenance. It's really just the past 100 years, and Disney specifically, that created the cultural mindset that the first person to express something owns it forever and everyone else has to pay them for the privilege of building the next work.
BTW I’m old enough to remember people decrying the rise of desktop publishing (“WYSIWYG”) as the automation of creativity.
I share your disdain for the geriatric political class, but I strongly disagree that this is a situation that needs to be managed. I say we let the arts return to the free-for-all they were for the first 80,000 years or whatever.
> Think of your favorite musicians. How many of them give attribution to where each musical idea came from?
Certainly not for every individual idea, but good musicians do a lot of attribution. I got to know a lot of music I love now after following a mention on the liner notes of another musician’s album, or having them mentioned in an interview.
The percentage breakdown you are describing is how I see this all playing out legally, such that the royalty for an IP holder = (tags in prompt) / (count of the same tags in the training data). I am oversimplifying this, obviously, but you get the idea. This approach would require the collective effort of major IP holders, but if record labels and streamers can figure out revenue pooling, I don't see why it can't work elsewhere.
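To make that formula concrete, here is a minimal sketch of how such a split might be computed. Every name and count below is invented for illustration; a real scheme would need audited tag counts from the training set.

```python
# Hypothetical royalty split: each IP holder's share of a prompt's royalty
# is proportional to their fraction of the training data for each tag used.
from collections import Counter

# tag -> count of training samples carrying that tag (assumed known)
training_tag_counts = {"impressionist": 50_000, "cityscape": 120_000}

# tag -> {ip_holder: number of their works under that tag} (assumed known)
holder_tag_counts = {
    "impressionist": {"museum_a": 5_000, "estate_b": 2_500},
    "cityscape": {"stock_agency_c": 30_000},
}

def royalty_shares(prompt_tags, total_royalty):
    """Split a royalty pool across IP holders: equal weight per prompt tag,
    then each holder weighted by their share of that tag's training data."""
    shares = Counter()
    per_tag = total_royalty / len(prompt_tags)
    for tag in prompt_tags:
        for holder, count in holder_tag_counts.get(tag, {}).items():
            shares[holder] += per_tag * count / training_tag_counts[tag]
    return dict(shares)

print(royalty_shares(["impressionist", "cityscape"], total_royalty=1.00))
# -> {'museum_a': 0.05, 'estate_b': 0.025, 'stock_agency_c': 0.125}
```

Note that most of the pool goes unallocated here, which mirrors the real difficulty: most training data has no identifiable rights holder.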
If the source material was mentioned for every generated image then I think it would be more like what you say. No percentages needed since that's not something we used to get from liner notes either.
They can't publish their training databases because that would be publishing copyrighted material, which is illegal. They can only train on it, which is potentially fair use.
It would be more accurate to say that they don't publish their training databases (including sanitized pointers to the copyrighted stuff) because they aren't sure that training is fair use.
They are sure, however, that it is a kind of infringement. Citing "fair use" is an admission of infringement - just a specific kind of infringement that is allowed.
I'd be very skeptical that AI would worsen the situation with music. For example, many pop titles of the last few decades incorporate the same millennial whoop over and over and over again. I seriously stopped following pop music a long time ago, and I can't imagine that AI could make it any more generic if it tried. I don't see a threat to non-generic indie music. AI is good at the generic stuff, as it usually statistically averages out its inputs.
When Nirvana played MTV Unplugged, they mostly played covers from bands that influenced them.
Also, no, Disney did not invent the notion of authorship or royalties. Having enough honor not to take credit for someone else's work goes back millennia. Attribution goes back millennia; otherwise we wouldn't know the names Sophocles, Aeschylus, Euripides.
Don't get me started on the pharaohs, mother fuckers loved carving their names into things.
> This is both amazing and saddening to me. All our cultural legacy is being fed into a monstrous machine that gives no attribution to the original content with which it was fed, and so the creative industry seems to be in great danger.
It is the same as what every human being is doing. We consume and we create. Sometimes creations are very good, but most of the time they are just mediocre. If the machines can create better average results, it will be due to the genius of the humans who invented those machines.
So we can be happy that we have such beings among us, and cherish that we will have better content to consume in the future. When you look at the world, you will see that there are still plenty of problems to be solved for humans.
In the same way the "organic" movement took over food, and we want to feel human skill and touch in what we are consuming, I think we will see a similar swing in media.
People invest in stories. They also invest in other people. This is why people love seeing Tom Cruise over and over again in movies. Or why I'm going to go see the next Scorsese movie.
Reality television is designed to be addicting and engaging, and it is very successful at that. I get pulled into the storylines whenever I watch. But I quickly turn it off. I don't watch it not because it isn't enjoyable, but because I realize it is cotton candy: empty trash that is not worth my time.
Artists are already often criticized for being "corporate." I think we'll see a similar effect for AI generated content. The hoi polloi and normies will slurp it up.
The true fans and passionate ones who give a shit aren't going to be fooled.
Proactively splitting up the menial tasks so that everyone is doing a little bit, inasmuch as they are able to, for a few hours a day, a few days a week, and getting paid a full-living wage for it seems like the way to go. Or, everyone pulls a Xiu Xiang from Rainbows End and goes back to high school.
The main obstacle to this is the pride and ego of the people who've "made it" up until now. Let go. Let society have nice things, even if you have to reinvent yourself. I don't think that creativity is endangered; art, uh, finds a way.
They manage it by meeting with Sam Altman while he runs around in incredibly expensive suits and tells them he will open an office in their country so they will all benefit.
I didn't go to film school or have any training in the creative arts. I love the fact that I will have an outlet for creative expression where my text can generate image, video, and sound. I can iterate over them, experiment with visualizations, and get better without technical barriers. Generative AI is making everyone an artist as well as a coder.
You could take your phone and film something outside your house in an interesting way and I'd probably argue that's more "art" than whatever glorified stock video AI generates for you.
I'm interested where the tooling can go in the long run - can I scribble a picture of a cat and have it turned into an accurate 3D model, then have AI animate it based on text? That would be neat. Text prompts into "something" isn't, to me.
A part of the book Look to Windward by Iain M. Banks deals with this: the machine Minds could comfortably write operas greater than any human's, but humans still went to the theatre just to appreciate them, and the impact on society was recognised. Of course, that world was post-scarcity, while ours is not.
Automating creativity, some claim, is an endeavor akin to catching smoke with bare hands: futile, if not utterly fanciful. Yet I can't help but ponder the peculiar ballet of human ingenuity and mechanical precision. Consider for a moment this strange juncture we find ourselves at, a place where the tools crafted by our own hands begin to sketch the outlines of what could very well be new breeds of creativity.
Let's muse on the notion that creativity, as we've known and cherished it, can be bottled up and dispensed by machines, up to a certain whimsical point. Beyond that? We stumble upon creations like these, novel tools that beckon us, the flesh-and-blood creators, to mold unforeseen "creativities." It's one spectacle to mechanize the known realms of artistic endeavor, quite another to boldly claim that machines shall inherit the mantle of creativity, henceforth dictating the contours of all future artistic landscapes.
History, that grand tapestry, is peppered with instances where the mechanical muses have dared to tread upon the sacred grounds of creativity. Take photography, for instance, a marvel of the 19th century that promised to capture reality with an accuracy that scoffed at the painter's brush. Or consider the digital revolution, which flung open the doors to realms of visual and auditory experiences previously consigned to the realm of dreams. The synthesizer, not merely an instrument but a portal, has ushered us into a new era of musical exploration, challenging the supremacy of the acoustic tradition.
Each of these milestones, while distinctly modern, echoes the age-old dance between creator and tool, where each step forward is both a continuation and a departure from the past. In this light, the question isn't whether creativity can be automated, but rather how our definition of creativity evolves as we, hand in hand with our mechanical counterparts, stride into the unknown.
Yes and no; I mean there are still painters around and we still appreciate their skill in the world of photographs. Sometimes it's only marginally about the finished product, but also about the work to make it.
The creators just don't care about humans, haha. I don't know why people are still learning communications, writing, art, or any other craft. Everything will be displaced by the next AI.
We have been on this path for centuries. Compare the symphonies of 200 years ago with our current music or painting. We humans prefer quantity over quality.
I see nice paintings (not black squares or abstract nonsense) like all the time. Feels like more people can paint at the level of “classics” now. Of course they cannot surpass the deeper meaning of the originals, because they aren’t dead yet and there’s no mystery and legacy around their names. But otherwise they are pretty good at making cool pics.
I guess when you know the why and how of something, it doesn't feel surprising anymore. That's why two computers playing chess is not a fun event. People would, however, watch two humans playing even if their moves were secretly controlled by a machine. By contrast, if (and when) generative content reaches indistinguishable levels, I doubt the majority of people would care whether a machine or a human created it. The biggest problem with AI, often disguised as one of its pros, is that it is reachable by anyone and everyone and can be used as a weapon.
This is similar to the problem manuscript-copying workers faced in the Ottoman realm. Printing machines would take their jobs, but they resisted and lobbied against them in the courts of the sultan. They succeeded in delaying the adoption of printing for some time, to the detriment of the people. Unfortunately, this has been the history of man and technology: destroying something, for better or worse.
Some twisted the story as if the underlying issue were religion, but economic concerns were the real reason.
> Creativity being automated while humans are forced to perform menial tasks for minimum wage doesn't seem like a great future and the geriatric political class has absolutely no clue how to manage the situation.
May I introduce you to the entire history of humanity between 7 millennia before the invention of writing to approximately 50 years after the invention of communism? :P
More seriously: yes, we have no clue how to manage the situation. The best guess right now is UBI, which looks cool, but then at a first glance so does communism and laissez-faire capitalism.
Time for, ironically because humans are surprisingly bad at this, a creative idea for how to manage all this.
I feel like a lot of that frustration comes from seeing "arts and culture" as the pinnacle of anything when maybe it's just an overvalued side-effect of human wiring to avoid boredom.
Imho, it's just really hard to argue that average non-educational entertainment has a positive net effect on global society.
Seeing it this way makes it way less surprising that "art" and "creative entertainment" is one of the first things that gets hit by automation.
Painter/illustrator here. I mostly agree with you. I often have wondered if what I do is a total waste of time, long before generative models showed up. My close childhood peers became doctors and engineers, and there just isn't any comparison about our contributions to society. People get all whimsical when I bring this up and say "but what about the [spirit/feelings/blah]. I'm clear eyed about it though. If I could go back & re-roll my character sheet (i.e. slap my younger self into realizing STEM is cool while those doors were still open), I certainly would.
However, there's a line somewhere. I've spent most of my life around drab midwestern utilitarian/corporate/commercial buildings, and it has been noticeably depressing. In the periods where I've spent time in beautiful buildings, I have felt much better. Based on anecdata, I'm not the only one. There's something important & essential for humans about ornamentation & beauty. It's more than entertainment.
Humans can live on rice and kidney beans, but if one must do so without hope for more tasty options[0] it is demoralizing.
[0] lots of people are happy with spartan diets, but most often those people are doing so by choice.
I have ~50k in debt, and my GPA was garbage. Self study and hobbyist pursuits seem to be my place unless I find a specific field+program I really love enough to bet everything on.
You don't have to feel it, millions of people start painting or other artistic endeavors when retiring. Most of the time the [market] value is close to 0. AI does nothing here.
Anecdote: My grandma retired and started painting and has since passed. The market value of these paintings is 0, nobody would buy them as they are just average. But I will never get rid of them because she created it. They have value to me only.
Yes, but the point is that in a few years, there won't be a difference. Those clickbait accounts already exist for AI generated images. How many impressionable or young people have been fooled into believing history that never happened?
More importantly, how can these accounts subtly direct the generations to instill modern ideology or politics into "historical" images, giving them historical credibility? Think of all the subtly white supremacist "retvrn" accounts, for example, falsely recontextualizing inventions and accomplishments to support their ideology.
We all need to be thinking much more creatively and cynically about how these tools will be abused. The technology will get better. The people who want to abuse it will get smarter. And your capability to distinguish fake information is likely much worse than you believe - to say nothing of younger people who have less context and experience to form a mental "immune system".
>How many impressionable or young people have been fooled into believing history that never happened?
I would say: all of them, since the dawn of history. Actually, far before, as treachery certainly precedes speech itself by a few million years in the struggle-to-survive game.
Just to take a contemporary, (mostly?) Western thing: how did it go the last time you looked straight into the eyes of kids to reveal to them that Santa Claus is a lie, and that yes, almost all the adults in their society are in on that evil conspiracy? And what about the adults around you deeply attached to their national myths, not even mentioning all the folklore around their afterlife beliefs?
But don’t worry, everything is going to go well, I promise and you know you can trust me. :)
Only if you consider "historical footage" to exclusively mean "[original] historical footage [stored in archives]" versus, e.g., "historical[ly accurate] footage".
If "historical" is going to be used subjectively with no further qualifying statements, then the meaning of "history" will be subjugated to the context it's being presented in. I don't see its use here as contradictory.
I love how they show the failure cases: compare that with Gemini 1.5 Pro's technical paper, which carefully avoids any test where it does not appear to hit 100% performance! I think confronting your failures is a condition for success, and Google seems much too self-indulgent here.
Imagine a movie script, but with more detail of the scenes and actors, plugged into this.
The killer app for this is being able to give a prompt of a detailed description of a scene, with actor movements and all detail of environment, structure, furniture, etc. Add to that camera views/angles/movement specified in the prompt along with text for actors.
In the future, you won't need to do any of that.
Your own AI will generate a movie for you and ask you if you feel like watching it. You will love it, because it will know your taste, your hobbies, your friends, your ads, your chat history, the websites you've visited... everything.
I am a huge proponent for AI, especially in film making. But I hope that real people have the opportunity to write, act, and direct themselves, or with a small group of semi-professionals or even amateurs, their own blockbuster big-budget-looking movies.
Absolutely unreal. Kinda funny how some people are complaining about minor glitches or motion sickness when this is the most impressive piece of technology I've seen. Way to go, OpenAI.
How is this done technically? So many moving parts and the tracking on each is exquisite.
My initial observation is that the camera moves are very similar to a camera in a 3D modeling program: an inhuman dolly flying through space on an impossibly smooth path / Bézier curve. Makes me wonder if there is actually something like a 3D simulation at the root here, or maybe a 3D unsupervised training loop, and they are somehow mapping persistent AI textures onto it?
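For what it's worth, that "impossibly smooth" quality is exactly what a parametric camera path gives you. A minimal sketch of a cubic Bézier dolly (control points invented, and no claim that Sora actually works this way):

```python
import numpy as np

def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bézier curve at parameters t in [0, 1].
    Position and velocity are continuous, so the motion has none of
    the jitter of a handheld or human-operated camera."""
    t = np.asarray(t)[:, None]
    return ((1 - t) ** 3 * p0 + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2 + t ** 3 * p3)

# Four arbitrary 3D control points (x, y, z) for a fly-through.
p0, p1, p2, p3 = map(np.array, ([0, 0, 2], [5, 2, 3], [10, -1, 2], [15, 0, 1.5]))

# 120 camera positions: a 5-second move at 24 fps.
path = cubic_bezier(p0, p1, p2, p3, np.linspace(0, 1, 120))
print(path[:3])  # first few positions along the dolly path
```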
The Lagos video (https://openai.com/sora?video=lagos) is very much how my dreams unfold. One moment, I'm with my friends in a bustling marketplace, then suddenly we are no longer at the marketplace, but rather overlooking a sunset and a highway. I wonder if there are some conceptual similarities how dreams and AI video models work.
Yeah that one has more surreal elements every time you watch it: the people at the table are giants compared to everyone else, someone is headless, the kid's hand warps around like crazy.
Those samples are incredibly impressive. It blows RunwayML out of the water.
As a layman watching the space, I didn't expect this level of quality for two or three more years. Pretty blown away, the puppies in the snow were really impressive.
Imagine someone combining this with the Apple Vision Pro... many people will simply opt out of reality and live in a digital world. Not that this is new, but it'll entice a lot more people than ever before.
Had the same thought. Seems like we’re entering the era of generative AI and mixed reality in a very real way very soon.
As much as I love the technology, I'm really not looking forward to this becoming ubiquitous. Time and time again we've allowed technological progress to outpace our ability to weigh the societal pros and cons.
Smartphones and the rise of image-heavy social media have rapidly changed social norms. Watch a video of people out in public 20 years ago: no screens to distract them at bus stops, concert events, or while eating dinner with friends. And if that seems trite, consider how well correlated the rise in suicide rates is with the popularity of these technologies.
Not sure if this makes me a luddite or if the feeling is common in this crowd.
Presumably the Post-atomic horror set back technology for a while, so we should be able to expect TNG-level technology before the war. This also explains why Kirk's Enterprise uses datatapes.
But you cannot walk through it or feel it, just watch it. There's still a huge gap to reality; smaller than before, but you will still feel it's fake very vividly, because those senses are missing.
Watching it is enough for a lot of people. Watching 1080p first person extreme sport videos on youtube is almost too compelling to me. I have to turn it off because it feels addictive.
Maybe some sort of implants that can generate senses. That would be hundreds of years off, because you'd have to simulate, say, weight and pressure, and the pinpoint accuracy of feeling friction.
In their research paper it says: "These capabilities suggest that continued scaling of video models is a promising path towards the development of highly-capable simulators of the physical and digital world, and the objects, animals and people that live within them." They are well set on that happening.
Cats are like hands: they are hilariously hard for generative models and then after thinking about it, you realize that cats/hands really are hard. I mean, look at photos of a black cat curled up where it might have its paws sticking out at any angle from anywhere from a solid black void. How the heck do you learn that?
Yeah, you just can't let all media, all the cost and hard work of millions of photographers, animators, filmmakers, etc be completely consumed and devalued by one company just because it's a very cool technical trick. The more powerful these services become the more obvious that will be.
What OpenAI does is amazing, but they obviously cannot be allowed to capture the value of every piece of media ever created — it'll both tank the economy and basically halt all new creation if everything you create will be immediately financially weaponized against you, if everything you create goes immediately into the Machine that can spit out a billion variations, flood the market, and give you nothing in return.
It's the same complaint people have had with Google Search pushed to its logical conclusion: anything you create will be anonymized and absorbed. You put in the effort and money, OpenAI gets the reward.
Again, I like OpenAI overall. But everyone's got to be brought to the table on this somehow. I wish our government would be capable of giving realistic guidance and regulation on this.
I'm kind of excited to see how scifi authors will tackle the generative AI revolution in their novels.
As of now, the models still need large amounts of human-produced creative works for training. So you can imagine a story set in a world where large swathes of humanity are relegated to being gig workers for some quadrillion-dollar AI megacorp, sitting around waiting to be prompted by the AI. "Draw a purple cat with pink stripes and a top hat," and then millions of freelance artists around the world start drawing a stupid picture of a cat, because the model determined that it had insufficient training data to produce high-quality results for the given prompt. And that's how everyone lives their lives... just working to feed the model, while everything consumed is generated by the model. It's rather dystopian.
I have a novel I've been working on intermittently since the late 2000s, the central conflict of which grew to be about labor in an era of its devaluation. The big reveal was always going to be the opposite of Gibson's Mona Lisa Overdrive, that rather than something human-like turning out to be AI, society's AI infrastructure turns out to depend on mostly human "compute" (harvested in a surreptitious way I thought was clever).
I've been trying to figure out how to retool the story to fit a timeline where ubiquitous AI that can write poems and paint pictures predates ubiquitous self-driving cars.
I would say it's very profitable in terms of ideas... if you put in the work. The problem is that most mass-market sci-fi is not about ideas, but about cool special effects and good guys vs. bad guys.
> As of now, the models still need large amounts of human produced creative works for training.
That will likely always be the case. Even 100% synthetic data has to come from somewhere. Great synopsis! Working for hire to feed a machine that regurgitates variations of the missing data sounds dystopian. But here we are, almost there.
Agreed, by some definitions, specifically associating unrelated things, models are already creative.
Hallucinations are highly creative as well. But unless the technology changes, large language models will need human-made training substrate data for a long time to operate.
It's bimodal. AI can automate a lot of low level knowledge work, but as wide and deep as its knowledge is, it is also incredibly superficial when it comes to logic and creativity. What it's going to do is hollow out the middle class, as creative people who know how to wield AI will become wealthy while the majority of white collar workers are forced into trades.
A major follow up to GPT-4 later this year is rumored to be (far) superior at logical reasoning than GPT-4. What's likely to happen if that becomes real?
That might let it encroach more into some fields like law where it's almost good enough already. Shitty time to be a junior lawyer, firms are going to hire and promote people not for their legal skills but for their ability to manage/attract clients.
In general though, I don't think the extra reasoning ability is going to enable it to displace that much farther than it already will, GPT lives in a box and responds to prompts. When it's connected to multiple layers of real-time sensor data and self-directing, that'll be another story.
There were independent efforts to create AI agents since last year as well. AutoGPT and BabyAGI iirc. They didn’t go far probably because the LLM used was not good enough for that.
> AI is automating the creative, intellectual work and leaving the rest to us.
Indeed, there is a risk it completely devalues creative jobs. That's ironic. Even if you can still use AI creatively, it removes the pleasure of creating. Prompting feels like filling Excel sheets while also feeding a pachinko machine.
If it were actually AI, instead of a stochastic parrot, we could ask it to design robots to do the manual labor that we still have to do, because we haven't been able to design such robots ourselves.
Unfortunately, LLMs aren't intelligent in any way, so you cannot ask them to synthesize any kind of second-order knowledge.
This is why they won't take away the creative work, either. They are fundamentally incapable of creating anything new.
This is the beginning of the end for many of them too. Look at the opening line of the page:
> We’re teaching AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction.
Text-to-video is just the flashy demo that everyone can understand after exposure to text-to-image. Once the model can "simulate the physical world in motion" it's only a few steps away from generic robotic control software that can automate a ton of processes that were impossible before.
Humans still have the benefit of dexterity and precise muscle control, but in the vast majority of cases robots can overcome those limitations with better control software and specialized robotic end effectors. This won't soon replace someone crawling under a house or welding in awkward positions, but it could, for example, replace someone who flips burgers or does manual lab work.
This could eliminate the limiting factor for automating many manual processes. (ruh-roh)
Less glibly: no matter how good you are at following instructions, tearing out a wall filled with water that can destroy your home, fiberglass insulation that can damage your lungs, and electrical wiring that can kill you will never be something I'd recommend a layman do, no matter how good the AI tutorials are.
Not when intelligence is cheap and highly abundant. Perfecting general robotics as an improvement on humans will be quick. The upper limit of strength and consistency is much higher.
Today, in the real world, AI can replace very little of what designers, programmers, etc. do. Lots of potential and extrapolation, sure, but it hasn't happened. What has actually been produced by AI has been panned as not quite ripe yet.
Same with robotics: lots of potential, but it hasn't happened yet. If you read the description, Sora is built around trying to simulate the physical world to solve physics-based problems, something that would be perfect for the next leap in robotics.
Think about it: Sora demonstrates that AI can understand real-world physics to a scary degree.
If you use Sora-like models to imagine what actions need to be taken, and then realize them, the only thing left is to create an arm and fingers that can take the action, and you are done.
Machines have replaced a lot of blue-collar jobs alright. It's just that most of it happened during the Industrial Revolution, so we aren't even aware of all the shitty (and not-shitty-but-obsolete-nonetheless) jobs that used to exist.
It's automating some of the craftsmanship part, which is substantial, but in a sense, it also threatens the creative part.
It's already very tempting for large entertainment businesses to create lazy remakes as it involves less risk. Automating creative jobs will create a shift at the production level but also on the receiving end: the public.
That would never happen, because someone owns the robots, rich people can afford more robots than poor people, and rich people aren't rich if poor people aren't poor.
Come on, don't you see that the capability to understand the physical world that Sora demonstrates is exactly what we need to develop those household robots? All these genAI products are just toys because they are technology demonstrators. They're all steps on the way to AGI and androids.
Sensorimotor control is, imo, not at all the bottleneck. Teleoperated androids could do lots of useful things right now, but the AI to automate them is lacking.
Well, let's say you want to make coffee, and we split that task into roughly two subtasks. The first is to imagine what motions are necessary: how the coffee cup has to move, how your hand has to move to grasp it, etc. The second is to find a way to use your muscles or actuators to execute those imagined actions.
I claim that the first part is the more difficult one and where we have the bottleneck currently. Furthermore, generative video AI is exactly the kind of thing that would give a model an understanding of what kinds of things have to happen in order to make coffee.
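A toy sketch of that two-stage split, just to make the claim concrete. Every function and action name here is hypothetical; the point is only that stage 1 (planning) is where a generative video model's learned world knowledge would plug in:

```python
def plan_motions(goal: str) -> list[dict]:
    """Stage 1, the claimed bottleneck: turn a goal like "make coffee"
    into a sequence of target motions/grasps. This is the "imagine what
    has to happen" part a video model's world knowledge could inform."""
    return [
        {"action": "reach", "target": "coffee_cup"},
        {"action": "grasp", "target": "coffee_cup"},
        {"action": "move_to", "target": "coffee_machine"},
    ]

def execute(step: dict) -> None:
    """Stage 2: low-level control mapping each planned step to actuator
    commands (inverse kinematics, force control, etc.). Stubbed out here."""
    print(f"executing {step['action']} -> {step['target']}")

for step in plan_motions("make coffee"):
    execute(step)
```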
Somehow, according to that logic, and in general the logic of all AI danger hysteria, humans have no agency in determining what the limits of what AI is fed and of its use and abuse.
Some humans do: the investors and executives in AI tech companies (and the legislators who could theoretically regulate them), who all stand to make a lot of money from every one of the "AI danger hysteria" scenarios, and are therefore highly motivated to bring them to fruition.
The rest of us have no choice. Despite millions of artists, animators, etc. being roundly opposed to AI art, the models that infringe on their work are still allowed to exist, and it seems they're fighting a losing battle.
A lot of people are being "hysterical" because a lot of people don't have a choice.
To be clear, the problem of these scenarios is tightly intertwined with the problem of unfettered capitalism and wealth inequality in general. Food and shelter require money, and we get money by working a job. If millions of jobs disappear overnight, then of course millions of people are going to be distressed over no longer having ready access to food and shelter.
The idea of "just getting another job" doesn't scale to the destruction of entire industries employing tens of millions of people. This is how depressions are made.
The idea of "the depression will end someday" is not only not necessarily true as wealth inequality skyrockets, but is also cold comfort to the people who will lose their houses and for some, lives, due to the disruption.
A different economic system could perhaps allow us to appreciate these technological advances without worrying about them displacing our ability to live. But the American political system consistently and firmly rejects any ideas not rooted in social darwinist capitalism.
For your sake, I hope your resume is very impressive.
If millions of jobs disappear overnight, it means AI is amazingly good, which means people will also have AI empowerment on a whole new level, as open source trails companies by 1-2 years. Everyone will just order their AI to "take care of my needs," or maybe work alongside it. You have to agree that we already have some amazing open models, and they are only getting better; that empowerment will remain with us in times of need.
"Companies employing people" will be replaced by "people employing AI". Open models are free, small, fast, trainable and easy to use. They capture 90% of the value at 10% the cost, and are private.
"Companies employing people" getting replaced by anything is pretty dangerous in an economic system where employment is synonymous with having food and shelter. It won't matter that AI could help me keep a to-do list or generate pretty videos if I don't have a job or income.
What we're looking at is a massive decrease in the relative economic value of the average human's work. If the economic value of a hundred people is less than what the company can produce with a single human operator running AI models, then those 100 people are economically worthless, and don't get to eat.
We drastically need to tax the usage of AI models on the huge windfall they're about to create for their operators, and use that to fund universal basic income for those displaced. Generally speaking, as automation and wealth disparity skyrocket, UBI will be required to maintain any semblance of the society we currently have. I am incredibly pessimistic about the chances of that happening in any real way though.
I would agree. While we are seeing all this creative work get automated by AI, how big of an impact would that really have on the economy?
Fully-functional autonomous driving will have a much larger economic impact - and that's just the first area where autonomous robots will come into our lives.
People on HN like to split hairs and make muddled juxtapositions about human rights and AI model capabilities. But this is something that people and governments around the world will have to reckon with very quickly, since at the rate generative AI technology is advancing, there could be hundreds of millions of people who are unemployed and have no way to find work.
The quickest way to address this would be an extremely high tax rate on any generative AI model, say 500%, while the government figures out what’s the best way to sustain an economy (such as UBI) with a diminishing set of consumers as more people are pushed towards unemployment.
Taxes are meant to capture some of the profit that is made by a business entity. You could use a local model, but if you sell some kind of product or service, the tax would be levied on you. Not declaring that properly, of course, is tax evasion :)
I suspect what was meant is something like 500% VAT, where if a generative AI charges a customer $6, then $5 goes to the taxman and $1 to the AI company.
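Working through that arithmetic (a sketch only; the 500% figure is the commenter's, and this treats it as a VAT-style rate on the base price):

```python
def vat_split(customer_charge: float, vat_rate: float) -> tuple[float, float]:
    """Split a VAT-inclusive charge into (base price kept by the seller, tax).
    With a 500% rate, a $6 charge decomposes into a $1 base and $5 tax."""
    base = customer_charge / (1 + vat_rate)
    return base, customer_charge - base

base, tax = vat_split(6.00, 5.00)  # 500% expressed as 5.00
print(f"AI company keeps ${base:.2f}, taxman gets ${tax:.2f}")  # $1.00 / $5.00
```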
The typical argument against technological progress: "We should ban technology to stop it from doing what humans can do, in a fraction of the time and with a fraction of the resources."
I can see this creating an explosion of new content from aspiring filmmakers and storytellers, and cutscenes from game creators, who previously never would have had the budget or capability to see their ideas through to creation.
If we had a safety net where career progression and time/money invested in training were unnecessary to sustain life, then maybe. Until then, it feels a bit like allowing a few people to plunder and own the collective output of millions.
This moment seems like trade guilds revolting against free craftsmen. What AI is essentially doing is learning skills from people according to their works and then helping everyone according to their needs. It's more rad than open source.
This is not plunder, it is empowerment. Blocking generative AI would be a huge power grab for copyright owners. They want to claim ideas and styles, and all their possible combinations.
Gen AI need only ensure it never reproduces a copyrighted work verbatim. Culture doesn't work if we stop ideas from moving freely.
I agree that preventing technology from dispersing generally prevents the creation of wealth. However, given our current economic structure, the resulting instability of livelihood has dramatic effects on the swaths of people unlucky enough to be disrupted: think of the dramatic costs of retraining, healthcare access, the high cost of diminished earnings, and the inability to accrue wealth and retire. Perhaps we could socialize these costs, but we don't, and we are unlikely to do so.
Another issue to look at is the lack of ownership of the tools of your trade. In a context where many use AI models to produce competitively, the hosts of AI models essentially own access to your trade, and are thereby able to charge a toll, or privilege certain behaviors, for anyone who strives to make a living with these tools (of course, this is already happening with plenty of software products). The ultimate trajectory of this is not the democratization of a toolset, but a transfer of wealth from labor to capital. And keep in mind that the labor share of income has been steadily declining for half a century.
The creation of wealth from AI ultimately depends on the strength of democratic and pluralistic institutions that safeguard ownership of your trade, democratized access to capital, and welfare in an environment of creative destruction. Otherwise you wind up with the cotton gin.
We all stand on the shoulders of giants. Yes, I want artists and other creators to be compensated fairly for any work that they contribute into training datasets, but outside of that there is no moral responsibility AI creators should feel towards those whose potential careers would be impacted.
They aren't. Every person is free to use AI or not.
Go blame your fellow consumers if you don't like the fact that they prefer AI.
These are choices that everyone makes. AI companies alone aren't forcing everyone to use their cool new tools. Instead, that's a decision that tens of millions of people are making every day.
"Many were increasingly of the opinion that they'd all made a big mistake coming down from the trees in the first place, and some said that even the trees had been a bad move, and that no-one should ever have left the oceans."
Does it not just shift where we (as people) perceive value? If the cost of content drops to effectively zero, it seems reasonable that we would not value it so highly. If so, it does not mean that people do not value anything, but it may mean we start associating value with new or different things. While this may disrupt industries, I do not think we have an ethical or legal duty to those industries to remain profitable.
GREAT response imo, I’ll try to remember this concise phrasing. I think this highlights that people aren’t worried so much about changes coming to them as consumers, and are much more worried about what “industries no longer remaining profitable” means for them as a laborer.
Yup! Technology is powerful. It impacts people's lives.
I love tech, but if you take the stance that it's okay to hurt people for the sake of technical progress, you get into some very dark and terrible places...
It doesn’t help that the tech industry is trying to make it seem black and white. Like you’re either endlessly optimistic and let tech run rampant or you’re a depressing doomer pessimist. We should reject this framing whenever possible.
I think dismantling creative fields like this is completely different from automating manual labor in a way that makes humanity more prosperous. I don't see what the upside is of this -- it's not making creative work better, it's devaluing creative work and disenfranchising creatives.
> it's okay to hurt people for the sake of technical progress
That's a strawman. The real view is that protecting jobs that are made extinct by technology and automation is historically a bad idea because it leads to stagnation and poverty. It's better to let people lose their jobs, and for those people to find other jobs, while supporting them with a social safety net while they make the transition. Painful for them but unfortunately very necessary for a prosperous society.
Instead of using this outrage and energy to push a political will to grant something that benefits everyone forever, we should use it to grant something that helps prop up a few people in dying industries so that they can stifle innovation which would lead to a creative revolution?
What no one is asking is: "If this makes it easy for anyone to be an artist, a director, a musician... what are we going to get, and will it be worse than what we have now?"
> What no one is asking is: "If this makes it easy for anyone to be an artist, a director, a musician... what are we going to get, and will it be worse than what we have now?"
Everyone is asking this.
But that's also not the only question. The one you're ignoring here is: If these tools enable one artist to do the work of a hundred, what happens to the other 99?
AI boosters have as yet offered no satisfactory answer to this question. Given the intimate involvement some of them have with politics at the national and global level, this absence constitutes reasonable grounds for suspicion that no answer is intended or forthcoming, and that suspicion is what's asking to be addressed here.
> If these tools enable one artist to do the work of a hundred, what happens to the other 99?
Not really. As people have gotten more efficient at their jobs, we tend to just produce more and better things, not impoverish a bunch of people. If one person spends a day (8 hours) making a shoe by hand, and another can make a shoe in an hour using a shoe-making machine, then we don't have one fewer shoemaker; we have two people making 16 shoes a day. As an effect, shoes become much cheaper, so they aren't only worn by rich people. If the one-shoe-per-day maker refuses to use a shoe-making machine, he or she can upsell their "hand-crafted" shoes to rich people who want to distinguish themselves.
Believe me, I am not a 'free market fixes everything' person, at all, but in these cases, that is how it has worked since the industrial revolution. This is not a new process (automation making a task much more accessible/efficient) and this is not a new complaint (what happens to the people who made a living doing the task).
Change is scary -- and everyone has the right to be afraid of an uncertain future, but I can't recall an instance of the regressive approach actually working to allay the fears of those who imposed it. Yet, we all see huge reminders of how our lives have been improved by making hard things easier and accessible to more people.
The argument as presented so completely omits the possibility of harm being done to anyone in this process that it seems designed to foreclose the thought at the root.
It would not surprise me if anyone called this pollyannaish, or even Panglossian.
You don't really touch at any point in your argument on even the possibility someone might be harmed, in the process of entire segments of the labor market being automated. Why is that?
It is assumed in anything with any kind of scale that harm will occur.
Did anyone get harmed when photography was used to supplant portraits? Did anyone get harmed when mail started getting sent by rail instead of horse? Did anyone get harmed when air travel became possible? Did anyone get harmed when we supplied electric power to homes?
I have an idea -- why don't you propose a solution to AI ruining creative jobs and we can apply that standard to it.
Price in the externality. The multiple of US GDP that OpenAI currently seeks in funding should certainly suffice to fund UBI, and if that slows down OpenAI's development of new capabilities, then that should still be preferable to the alternative of OpenAI being enjoined from doing business until that is done.
Of course you may respond that this is unrealistic, which it is; it requires a government capable of acting via regulation in defense of its citizens, and so nothing like it will be done.
The social safety net component of your idea is both extremely important and not at all likely in the modern ultra-capitalist, "even healthcare is socialist extremism" political atmosphere.
Maybe mass unemployment will create a sea change in that mentality, but most of the people whose opinions need to change will probably just laugh at "the elites" getting screwed over.
Is it? What is another example of a technological leap that made a certain class of workers redundant while also continually relying on the output of these same workers to be feasible in the first place?
The current batch of LLMs is in the same class of technological revolutions as Napster and The Pirate Bay. Immensely impactful, sure, but mostly because of theft of value from elsewhere.
New data can still be created using AI and curation, couldn't it? New works, incorporating AI or not, still enjoy copyright protections that one can monetize by selling access to that specific work.
I really don't think that's true. Essentially, the argument is that these models are more or less just outputting the work of others: work already done, not theoretical future work, which is what people usually criticize new technologies over.
The question here is really about whether it's sufficiently transformative, or whether that's even the right standard to be applied to generated media.
Sorry no. If there was even the remotest possibility that everyone could be brought to the table, none of these would even exist.
Training a massive model like this is a risk, and no one is going to take that risk without some reward. You can complain that OpenAI is capturing too much of the value, but it's value that would otherwise never have existed.
Research on creativity and competition points to this. Essentially, creativity occurs when there is some expectancy of increasing competitiveness. However, when the expectancy of capturing value from your effort becomes less clear, or diminished, creativity stops altogether.
(as pointed out in the "Freakonomics" episode highlighting this research)
Things that can be easily reproduced already have little value; the people who produce those things have adapted by focusing on brand, and that's just how it's going to be from now on.
Reminds me of an interview with a Korean pop music producer I watched 15 years ago.
South Korea had a high % of broadband penetration earlier than many Western countries, and as a result physical CD sales crashed very hard, and very quickly. So he asked himself, what's the most analog good I could sell? It's people. And went the pop idol / personality marketing route with great and lasting success.
I don't think they're saying it's up to tech companies to decide what has value, more that the development of new technology itself ends up deciding for the rest of the world how things are valued.
It's been this way for 10,000 years, since the invention of the wheel. New inventions change how things are valued by making it easier for people to do more work in less time.
The creators who create media can also use these tools to create more media faster, as can novices. It's not like OpenAI literally eats the media, never to be shared with the world again.
I create media for a living, painstakingly building things from scratch in 3D. This tool will not help me; it will help clients avoid ever having to contract me. The main beneficiaries of this are holders of capital.
Oh I see, they're not eating the media, just extorting the creators into paying OpenAI in perpetuity to use the tool derived from their own work, or face becoming uncompetitive with their peers who do use it. What if landlords, but for media creation, and they don't even have to pay for the land in the first place. That's fine then.
> pay a subscription to OpenAI in perpetuity in order to remain competitive with their peers
This is how technology works in general and should not be vilified. Someone comes up with a better way to do things (in this case bringing creative ideas to life) and charges a premium on top of that for their efforts. If the current wave of creators doesn't like it, then they should instead make something people want more than what their competition has to offer.
Either way, this is why local open source models are critical, so that everyone can benefit without needing to pay any single party.
If a company were founded tomorrow that let you stream unlicensed TV shows and movies for a monthly subscription, undercutting Netflix's and Amazon's licensed streams, that wouldn't be described as "a better way to do things" just because customers prefer it for being cheaper and easier, with all the content in one place. The difference between that and what OpenAI is doing is just degrees of abstraction: either way, they're deriving value from others' work without compensating them, and actively undermining the ongoing creation of the work they're appropriating, while simultaneously relying on that ongoing creation to keep feeding their machine.
IP law has yet to decide whether my interpretation of the situation is correct in the legal sense, but I find it impossible to see "ChatGPT absorbs the work of writers/journalists and sells a superficially reworded version without attribution or compensation" as anything but theft obfuscated behind lots of fancy math. It's only going to get worse if LLMs end up displacing traditional search engines, so one day you'll publish an article and get exactly one impression from GPTBot which then turns around and figuratively copies your homework.
Forgive me for thinking that it may be difficult for independent artists to compete against the trillion-dollar groundbreaking plagiarism machine that is actively plagiarizing their work faster than they can produce original work, without consequence, and suffocating them under a deluge of generated works.
This is an extreme difference of scale, which does constitute a meaningful difference from prior technologies.
It’s difficult for independent artists to live as independent artists today, even without the specter of a “trillion-dollar groundbreaking plagiarism machine”[0]. So far, we’ve still been producing original work, primarily because it’s what we do even when we’re not making money from it. It’s a blessing and a curse.
This is not to dismiss the concern. I simply wanted to state that artists will find ways to keep moving the creative bar forward.
[0] I really like this turn of phrase, thank you for sharing it.
Interestingly a lot of movies flopped in 2023 not because of bad visuals, but because their writing was bad. Hence, I believe the demise of the movie industry is overstated. I can see completely new forms of entertainment coming out of this. Probably Youtube will be the biggest winner as the social network with the highest monetization and reach.
You can't regulate it because it will just be outsourced to another country.
Nope, we are headed towards deflation. Families that need only a single worker to support everyone, and even support extended family, and less time working overall.
I don't disagree with your basic sentiment, but it's worth pointing out that, on some level, the *entirety of artificial intelligence* is not much more than a "cool technical trick."
I am getting sick of these "people can't be allowed to make their own nice things easily, because of a pugnacious (and very online) interest group that wants to keep getting money" takes.
> capture the value of every piece of media ever created
In what way does “I have a computer that can make movies” mean “I have captured the value of every piece of media ever created?” What do you mean by “value”? In my biased view, this amazing new technology couldn’t possibly be a better time to fix our insane notions of property, intellectual or otherwise
Are you against records? Because the technology to record songs and play them back at your leisure killed an entire industry of live performers / instrumentalists?
The call for live music drastically shrank when it became trivial for any business or residence to play music on command.
Are you against automatic language translation? I can positively guarantee that the training data that they used to be able to create significantly better translation models was not authorized for that purpose.
The entire translator industry has been steadily shrinking ever since the invention of automatic language translation.
Etc etc etc.
There's obviously two aspects of this complex social issue right now.
1. Whether or not the usage of publicly available media as training data is legal/ethical.
2. Whether or not the output of these types of generative systems (even if they're trained on "ethical" training data) which may result in the displacement of many jobs is legal/ethical.
I'm neither for nor against AI (LLM, diffusion, video, etc), but if you are going to take a stance, then you have to be consistent in your view.
You don't get to cherry-pick: I don't want to see you using ChatGPT, Copilot, Stable Diffusion, DALL-E, Midjourney, Sora, etc.
It's weird that a call for generative AI to be more equitable towards the people whose creative work powers it is being interpreted as somehow being against tech, against AI, or that I think technological advancement should never make jobs obsolete.
>Yeah, you just can't let all media, all the cost and hard work of millions of photographers, animators, filmmakers, etc be completely consumed and devalued by one company just because it's a very cool technical trick.
Oh man, how I miss it when ice was hauled from the Arctic in boats.
You recognize the difference, right? Modern freezers don’t rely on people shipping ice from the Arctic. Generative AI does rely on people continuing to create media.
Do you feel the same about the hard work of knocker-uppers having been devalued by the invention of the alarm clock? Or is it just the (relatively) highly paid intellectual workers that "cannot be allowed" to be replaced with machines?
It doesn't really matter, because if this is possible then it will not be exclusive to OpenAI for long. It's simply just something that can exist. There will be open source versions of everything lagging 1-2 years behind or something.
Never ever will there be everyone at the table. This is not how the Internet works. It is not how the world and humanity work. If OpenAI doesn't do it, the next big player will. China will. Maybe it'll soon not even need China because it'll be so easy to deploy.
There is no stop now. It's too late for that. Time to think about the full development and how we'll handle that. How we as people will be able to exist next to it. What our purpose in the world is supposed to be. What the purpose of "value" is. What the purpose of "economy" or "the market" is.
It's worth remembering that "intellectual property" is an entirely artificial and fairly recent construct. Humanity did fine for thousands of years without it, and I'm not going to shed too many tears if OpenAI blows it up.
I see the validity of this concern in the short term, but long term I feel like this is a bit doomsday. I don't want anyone's livelihood to get shafted, but realistically I see this as lowering the barrier to creating videos / proofs of concept--which is a good thing (with a lot of caveats and asterisks).
I wonder why the input is always text - can't it be text, as well as a low quality blender scene with a camera rig flying through space, a moodboard, sketches of the characters etc.?
My guess is because the models were all trained on text. You could do as you say, but I think it would go: blender video -> (gets described by an AI into text) -> text prompt -> video.
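To make that guess concrete, here's a minimal sketch of such a caption-then-generate pipeline in Python. Everything in it is hypothetical: neither Sora nor a frame-captioning endpoint is publicly exposed, so both model calls are stubbed out.

    # Purely hypothetical caption-then-generate pipeline; both model calls
    # are stubs, since no such public APIs exist today.
    def describe_frames(frames: list) -> str:
        # Stub for a vision-language model that captions sampled frames.
        return "Camera flies low over a snowy valley toward a wooden cabin at dusk."

    def generate_video(prompt: str) -> bytes:
        # Stub for a text-to-video model such as Sora (no public API yet).
        raise NotImplementedError(prompt)

    def blender_scene_to_video(frames: list) -> bytes:
        description = describe_frames(frames)   # rough Blender render -> text
        prompt = description + " Cinematic lighting, 35mm film look."
        return generate_video(prompt)           # text -> video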
These samples look pretty amazing. I'm curious about the compute required to train and even deploy something like this. How would it scale to making something like a CGI Pixar movie?
It's impressive, but I think it's still in the same category as even the best LLMs: the demos look good and they can be quite useful, but you can never quite trust them. You really can't just have an LLM write a whole report for you - who knows what facts it'll make up, what it'll miss? You really can't use this to generate video for work, who knows where the little artifacts are (it's easier to tell with video).
The future of these high-fidelity (but not perfect) generative AI systems is in realizing we're going to need "humans in the loop". This means designing to output human-manipulable data - perhaps models/skeletons/textures instead of whole output. Pixels are hard to manipulate directly!
As for entertainment, already we see people sick of CGI - will people really want to pay for AI-generated video?
> The future of these high-fidelity (but not perfect) generative AI systems is in realizing we're going to need "humans in the loop"
Last weekend my 7-year-old decided he wanted to make and sell a shirt with an image of a space cat shooting a laser gun. It took him like 1 minute to use the free DALL-E 3 to make and choose an image. Then I showed him a website to remove the background. Then I showed him a tool to AI-upscale the image. Then we uploaded it to Amazon Merch, it got approved after a few hours, and now it's for sale on Amazon. It took us maybe 10 minutes of effort end-to-end. Involved no artists.
Funny enough, Amazon is full of AI-designed merch, there were like 7 pages of shirts with space cats with lasers.
I am a CG artist and director, and this made me so sad. I am watching in horror and amazement. I am not anti-AI at all, but being on the wrong side of efficiency is heartbreaking for the individual. It's so much fun to make CG and create shots, and the fact that it's hard (just like anything) makes it rewarding.
Ex-colleague then! I'm kind of glad I got out of it all now that I see this, but on the other hand it's also an amazing opportunity unfolding, as long as it's directable. What a great toolset! Things that used to take an army of people freezing their asses off on location, working with actors... soon gone. Well, if you want it to be. On the other hand, look at what happened to imagery and concept art in general. For the most part it cheapened it, turning it into this mass-produced, easily available thing that isn't special anymore. Skills are still needed to produce exactly what you want, but the special flair is kind of gone. It will take far more energy and creativity now to stand out.
The point is that by doing, you become really good in creative fields, in any field. Prompting is not doing. What makes you a really good programmer? Writing code.
The pursuit of mastery is the essence of any craft.
I can't help but worry that this will make it too easy to create movies and the product will be of much lower quality. There is precedent here in the music industry. A recent report said that about 70% of music sales were catalog music, implying that people are buying less new music than old. I personally feel that's because the new music just isn't very good, and one of the reasons is that it's too easy to make and distribute music now.
That is a ridiculous take. Look at the absolute SEA of bottom-barrel content flooding every single streaming platform. For people at the top of the studio system, they are already living out their AI power trips, just in the meatspace.
The entire industry is already turning out terrible shit, but doing it by wasting hundreds of thousands of actors, production teams, and studio dollars in order to churn out that nonsense.
Meanwhile, there are millions of latent storytellers who, for whatever reason (but primarily: not born into extreme wealth and nepotistic connections), could never express their ideas in motion/cinema at such ambitious scales.
By putting this power in the hands of actually talented writers and storytellers, you create a completely new market of potentially incredible works of art.
Sure. But you have to admit that you also create a new market of low effort garbage art. The question is which is bigger, and where the money will ultimately go.
"Things are already bad. How could you be mad about making it much easier to make things worse? Quality isn't compatible with today's business ambitions."
I think it's worth remembering that all of these AIs work by having an unbelievably large number of weights: so many that it's all an uncontrollable black box. On the other hand, your work is all about having control, and I don't personally expect your work to lose value for this reason.
Another thing to think about is what the AI is designed to do. Without knowing the details, I would expect it to be trained to produce the 'most likely' output given the prompt. Consequently, I would think being inventive is against its design, and 'most likely' is effectively the same as 'average'.
Why the terror? Your job will change a bit but won't be gone. You would guide the output and make prompts not with text but with your own CGI video shorts, to make things 100% to your liking, and the AI will do the rest of the dirty work. Your productivity will grow, and the quality of your work too. You would be able to make an AAA movie all by yourself on a laptop. Since everyone would be able to do the same, the competition over imagination and ingenuity in scripting and artistic vision would skyrocket. :) IMHO
You are rather cavalier about other people's livelihoods. There will be budget for maybe 10% of the people currently employed, and yes, they will be making use of the new tools and they'll adapt. The other 90% are going to be doing doordash until they can figure out a new career.
Initial displacement will happen and it will require time for society to adapt and new industries to mature. The printing press significantly reduced the cost of producing books and other printed materials, which led to a dramatic increase in the availability of books, literacy rates, and the spread of knowledge. This technological advancement didn’t just replace the scribes; it created new jobs in printing, publishing, book selling, and eventually led to the creation of new genres of literature.
Who lost their jobs to the printing press? The monks who were the only scribes back then? They got their time freed up to spend on other duties in the monasteries, and mayhaps even more time to read books rather than to scribe them. So the level of education grew even for them.
The same will be true for FX artists, 3D artists, etc. The level of their work will grow; they will spend less time on dull work and more on tinkering with tiny but more important things like ideas, emotions, art overall, etc.
The terror is because companies want to maximize profits and a great way to do that is to minimize costs.
If you have a team of X people producing Y pieces, and now X people can produce 10Y pieces, everything is fine as long as the demand for pieces keeps up. But if your company really only needs Y pieces or really any amount less than 10Y then the easiest thing for a company to do is go, "We don't really need X people, let's fire some"
Getting fired, in America at least, means loss of healthcare, income, and if it persists long enough housing. Most people are terrified of being homeless, broke, and without access to medicine.
AI causes the supply and demand to change by creating additional supply of pieces through increased productivity.
It's cold comfort to someone getting fired to tell them "If demand had also increased 10 fold you wouldn't have to sleep on the street."
The actual living human being who has had their livelihood destroyed probably isn't any less scared of their fate because you cleverly tut at them and go, "In actuality the AI didn't do anything bad to you, it just created a glut of supply and the market demand didn't keep up."
Depends on what you think is "dull work". I think there are many artists who would welcome some of the "creating work" being automated. What part? Depends on the artist and their preferences. AI can take on the burden of any type of work and leave the parts that are needed for the human to do. The human can choose which parts to work on. That's the point.
Oooh this is gonna usher in a new wave of GPT wrappers!
If anyone's taking requests, could you do one that takes audio clips from podcasts and turns them into animations? Ideally via API rather than some PITA UI
Being able to keep the animation style between generations would be the key feature for that kind of use-case I imagine.
Apple vision pro + OpenAI entertainment on the fly + living in a tight pod next to millions of other people, hooked onto life support. A wonderful matrix fantasy
The HN server runs smoothly and seems to be having a walk in the park; impressive compared to previous OpenAI announcements. Have there been significant rollouts?
These look fantastic. Very slight weirdness in some movement, hands, etc. But the main thing that strikes me is the cinematic tracking shots. I guess that's why they use "scenes". It doesn't seem like a movie involving actors talking could be generated with this.
Not that this isn't a leaps and bounds improvement over the state of the art, but it's interesting to look at the mistakes it makes - where do we still need improvements?
It "eats" several people with the wall partway through the video, and the camera movements are odd. Strange camera movement in response to most of the prompts seems like the biggest problem. The model arbitrarily decides to change direction on a dime; even a drone wouldn't behave quite like that.
I find creepy things in all the videos, despite their breathtaking quality at first glance. Whether it is the way the dog walks out into space or the claw-like hand of the woman in Tokyo, they are still uncanny valley to me. I'm not going to watch a movie made this way, even if it costs me $0.15 instead of $15.00. But then, I got tired of Avatar after watching it for 20 minutes. Maybe all the artificial abundance and intellectual laziness of the generative AI world will make us realize how precious and beautiful the real world is. For my kids' sake, I hope so.
Sure, but imagine using this as a generative-fill to augment a movie, not just making an entire movie from it. We've seen fantastic homemade movies from very talented artists before. Now imagine if mostly talented artists could do it too.
This is the harbinger that announces that, as a technologist, the time has come for me to witness more and more things that I cannot understand how they work any more. The cycle has closed and I have now become my father.
How is that new? People built a gnomon: a stick thrust into the soil and ta-da. No doubt it happened far before any writing system was out there. Yet it still took humans quite some time to come up with a compelling heliocentric model to cast some graspable explanation over it all, even if you take Aristarchus of Samos as a pioneer in this field.
Ok, maybe from some perspective I'm with you here. There are things happening that no one, even those on the edge of the fringe, can understand anymore: how it works, even while it plainly does. Or at least that is how it seems from my narrow perspective on AI.
On the other hand, I don't feel like you need to know how a compiler works, let alone the hardware architecture it targets, before you can go through your first hello-world program or even build useful software on top of frameworks/libraries treated as black boxes. So "I have no idea what I'm doing" in this sense is probably as old as CS/informatics.
There's a huge difference between "I don't understand how X works", and "Nobody understands how X works".
Also, every single abstraction is leaky, so often it's the difference between "I don't need to know how X works right now" and "I can never find out how X works because it's simply not knowable".
My dad is 80 and willingly loves to listen to me explain how neural networks work; then he also reads about them, busy beaver functions, Kafka, and all kinds of crazy shit I tell him about. This is all in your mind. You are as young as your mind is.
Not the original poster, but the more frightening part of the sentence is the "not understanding how something works" part, not the "becoming my father" part.
Getting to a point where realistically you're not able to know something deeply but then still use it is pretty frightening.
When I say deeply I don't necessarily mean that for every device you need to know about all of its atoms, but to have a pretty good framework for how the thing works deterministically, and how it can fail.
> Getting to a point where realistically you're not able to know something deeply but then still use it is pretty frightening.
This now applies to most things in modern industrial society. We operate our daily lives at a crazy high level of abstraction. I think for a lot of us on HN, we "know too much about what we don't know", and that is ... overwhelming.
Funny enough, most people are actually able to operate at these higher levels of abstraction without worrying too much, because they don't know enough about what they don't know.
> Not the original poster, but the more frightening part of the sentence is the "not understanding how something works" part, not the "becoming my father" part
Thankfully it's nothing magical. But are you willing to learn about it or not?
Think about animation, how a program can generate a sequence of a bouncing ball between two keyframes. Think about what defines a video. The frames, right? From there I can try to imagine.
This is the key. I have enough curiosity to want to learn the stuff from the ground up, just as I did with other technologies. But man, do I have the stamina today? Not so sure!!!
Thanks for putting this into words. It's a very off-putting feeling for me, and I couldn't exactly figure out what that feeling was. It both scares me and excites me in a way that only makes me subconsciously anxious. Time to deep dive before I become what I always feared: technologically left behind.
This is likely a wild guess on my part, but I've faced a similar feeling lately. If this comes from the realm of webdev, React, SSR and all the f'ing acronyms we need to learn today, and you want to feel like you've "caught on": my advice would be to avoid NextJS at all costs. It's too bleeding edge.
Opt for a sane option instead to get started, likely one of these: (Astro, SvelteKit or Remix).
Lol there's a massive difference between a framework that generates javascript, a language which has existed for 30 years at this point, and a magic LLM that no one on earth understands the internals of.
It's fascinating that it can model so much of the subtle dynamics, structure, and appearance of the world in photorealistic detail, and still have a relatively poor model of things like object permanence.
What's the connection between this and high-end game engines (like Unreal 5)? I would expect 3D game engines to be used at least for training data and fine-tuning. But perhaps also directly in the generation of the resulting videos?
For example, this looks very much like something from a modern 3D engine.
They almost certainly trained on video game output and this is clearly bleeding into the style of some of these demos.
The SUV video, for example, looks very much like something you'd see in a modern video game, which probably makes sense because most videos with that kind of perspective are going to be from video games.
I don't know how they would use game engines directly for training and fine tuning though. It would be far too labour intensive to render high quality scenes using a video game engine for every prompt.
Does OpenAI just sit around with these kinds of features in their back pocket, waiting for a Gemini announcement so they can wait an hour and absolutely dunk on Google?
Why are you able to have a fun job, when another human has a non-fun job? Because you're more talented and have skills they lack. Same goes for AI versus you. You're just starting to feel what billions of other people have felt, for a long time.
We're both saying "the current system is bad because the way it works will interact with ai to create negative outcomes" and you're saying "wow you're very stupid, here's how the system works." We're aware friend, that's the problem.
Yeah, it would be way better if they just released it right away, so that political campaigns can use AI generated videos of their opponents doing horrible/stupid things right before an election and before any of the general public has any idea that fake videos could be this realistic.
you joke, but the hobbling of these 'safe' models is exactly what spurs development of the unsafe ones that are run locally, anonymously, and for who knows what purpose.
someone really interested in control would want OpenAI or whatever centralized organization to be able to sift through the results for dangerous individuals -- part of this is making sure to stymie development of alternatives to that concept.
Every time OpenAI comes up with a new fascinating generative model, it also allows for that bluntly eye-opening perspective on what a flood of crappy and unnecessary content we have gotten accustomed to having thrown at us. Be it blown-up text descriptions and filler talk, or these kinds of vodka-selling commercial videos.
It's a nice cleansing benefit that comes with these really extraordinary tech achievements and that should not be undervalued (after all, it basically produces an endless supply of equally trained producers, like the industry did before in a somewhat malformed way).
Poster frames and commercials are thrown at us all the time, consumed by our brains to such a degree that we actually see a goal in producing more of them to act like a pro. The inflationary availability that comes with these tools seems a great help in leaving some of this behind and drawing a clearer line between it and actual content.
That said, Dall-E still produces enough colorful weirdness to not fall into that category at all.
I see many possibilities for commercials, demos... not to mention kids' animations, of course.
Actually, thinking of this from the perspective of a start-up, it could be cool to instantly demonstrate a use-case of a product (with just a little light editing of a phone screen in post). We spent a lot of money on our product demo videos and now this would basically be free.
how will the AI know what your product looks like? You probably already have CAD models, couldn't you import those into blender and make something in an afternoon or two?
> how will the AI know what your product looks like?
Training an embedding/LoRA on the product and using it with the base model, same as is done for image-generation models (video generation models often use very similar architecture to image generation models -- e.g., SVD is a Stable Diffusion 2.x family model with some tweaks.)
Now, you may not be able to do this with Sora when OpenAI releases it as a public product, just like you can't with DALL-E. But that's a limitation of OpenAI's decisions around what to expose, not of the underlying technology.
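For the image-side version of this, the workflow is already routine. Here's a minimal sketch using Hugging Face diffusers, assuming a LoRA has already been trained on a couple dozen product photos; the weight path and the "sks gadget" trigger phrase are made-up placeholders:

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # Load LoRA weights previously fine-tuned on photos of the product.
    # "./my-product-lora" is a hypothetical local path.
    pipe.load_lora_weights("./my-product-lora")

    # "sks gadget" stands in for whatever trigger phrase the LoRA was trained with.
    image = pipe("a photo of a sks gadget on a wooden desk, studio lighting").images[0]
    image.save("product_shot.png")

A video model with the same architectural lineage would, in principle, accept the same kind of adapter.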
Value is going to be higher for professions where the human essence is an essential component of the function. Or professions that are more coupled with physical reality…my hedge is probably becoming an electrician.
I’d imagine IRL no-tech experiences will be the new ‘escapes’ too.
Maybe I’m too idealistic about the importance of the human spirit/essence…whatever that actually is.
It's interesting how a lot of the higher-frequency detail is obviously quantized. The motion of humans in the drone shots, for example, is very 'low frequency' or 'low framerate', and things like flowing ocean water also appear quantized. I assume this is because the internal precision of these models isn't very high?
I used to think, a few years ago, that virtual reality/AI projects such as the metaverse wouldn't amount to anything big. I even thought of them as ridiculous. Even recently, I thought that GPTs and AI-generated images would be the pinnacle of what this new AI wave would amount to. I just keep getting baffled.
If you draw a line from Pong (1972, or 52 years ago) to Sora, what does that imply for the quality and depth of simulations in 2076 (52 years in the future)?
Would we be able to perceive the differences between those and the physical world? I can't help but feel like there is a proof for the simulation theory possible here.
The idea that prompting is a creative tool is utterly illogical.
This will result in a ton of mediocre synthetic crap for corporate presentations and porn generation.
Contrary to the trends in SV, the dehumanization of creative professions will result not in a productivity boost but in utter chaos, and as a result will add more time lost to the production process.
I never liked Sam Altman in his Y Combinator years; now I know why.
Even with the "blessings" from the "masters" in Davos/Bilderberg, a bad idea is a bad idea. Maybe this will push World ID as a result, but is it necessary?
The current trends in tech are not producing solutions for a professional problem. With rare exceptions, this looks more and more as removing of human input and normalization of a society ruled by AI at any cost. So sad.
I've always been a digital stills guy, and dabbled in video.. as a hobby.
As a hobbyist, I always found the hardest thing is making something worth looking at. I don't see AI displacing the pleasure of the art for a hobbyist.
My next guess is the 80/20% or 95/5% problem is gonna be stuff like dialogue matching audio and mouth/face motion.
I do see this kind of stuff killing the stock images / media illustrator / b-roll footage / etc jobs.
Could a content mill pump out plausibly decent Netflix video series given this tool and a couple half decent writers.. maybe? Then again it may be the perpetual "5 years away". There's a wide gap between generating filler content & producing something people choose to watch willingly for entertainment.
Wow - "All videos on this page were generated directly by Sora without modification."
The prompts - incredible and such quality - amazing. "Prompt: An extreme close-up of an gray-haired man with a beard in his 60s, he is deep in thought pondering the history of the universe as he sits at a cafe in Paris, his eyes focus on people offscreen as they walk as he sits mostly motionless, he is dressed in a wool coat suit coat with a button-down shirt , he wears a brown beret and glasses and has a very professorial appearance, and the end he offers a subtle closed-mouth smile as if he found the answer to the mystery of life, the lighting is very cinematic with the golden light and the Parisian streets and city in the background, depth of field, cinematic 35mm film."
If "given that everyone's creativity isn't top notch, the highest quality will be limited to the best", that implies the existence of professionals, which implies work.
I meant those that are proficient/creative enough to be creating top content using AI, but if we take it further to AI using AI, then yes, it's AI all the way down.
We won't and the world will go into a massive depression, destroying the market for AI produced garbage and staving off global warming for a few extra years in the process. So even better than UBI.
Today we scroll social media feeds where every post we see is chosen by an algorithm based on all the feedback it gets from our interactions. Now imagine years down the road when Sora renders at 60 fps, every frame influenced by our reaction to the prior frame.
With the third and last videos (space men, and man reading in the clouds), this is the first time I have found the resolution indistinguishable from real life. Even with SOTA stills from Midjourney and Stable Diffusion I was not entirely convinced. This is incredible.
This is super cool. So many innovations come to mind. But it makes me wonder what will come from having the ability to virtually experience anything we want. It'll take a while, but I'm hoping we'll eventually want to go outside more instead of less.
What the f. What. I'm no AI pessimist by any means but I thought there are some significant hurdles before we get realistic, video generation without guidance. This is nothing short of amazing.
It's doubly amazing when you think that the richness of video data is almost infinitely greater than text, and it requires no human-made annotations.
The next step is to combine an LLM with this, not for multimodality but so the two can team up to make a 'reality model' with a shared understanding.
I have called LLMs 'language-induced reality models' in the past. Then this is a 'video-induced reality model', which is far better at modeling reality than language alone, as humans can testify.
What happens when humanity stops generating new content / recording new findings, knowledge, etc.? Are we at a place where whatever we already have is enough knowledge for an AI takeover?
As a counterpoint, I don't think the average person has stopped taking pictures just because image generation models exist. Nor have people stopped pursuing other hobbies impacted by AI. We don't go to museums to look at AI art that was created in 10 seconds, and I doubt culture will shift to a point where that's commonplace. Human content will still be created, and we will probably see the general quality of that content increase as a result of foundational models. Content creation is taking what's in the mind and translating it into the physical/digital realm. With better AI, this translation becomes easier for a lot of fields, and you no longer have to master the tools to make your art high quality. However, everyone can agree that prompt-based generation is a lot less satisfying than making content from scratch. It feels more akin to a Google search than a satisfying creative process. Those who are passionate and talented will continue to pursue their physical medium because of this.
The monetary value of generic stock content will surely drop, and it won't be created by professionals anymore. However, that doesn't mean people will stop taking pictures of their dog just because they can get Midjourney to generate the same thing. Creation for the sake of creation will continue. AI companies will initially reap a lot of the $ value that used to go to the creators of stock content, but when open-source models reach parity, the masses will be able to make what's in their mind a reality as casual creators. Hobbyists will still exist, and those that become truly great will still rise to notoriety.
I wonder what this tech would do using a descriptive fragment from a book. I don't read many books at all but I would spend some time feeding in fantasy fragments and see how much they differ from what I imagined.
It's odd how the model thinks "historical footage" could be done by drone. So it understands that there should be no cars in the picture. But not that there should be no flying perspective.
This is machine simulated art. It is not a convincing simulation to videographers, yet it pleases software architects and other non-visual artists. Aptitude for visual art making provokes envy in some who lack it. The drive to simulate art is almost as common as the desire to be recognized as a capable visual artist. The most interesting generative art I’ve seen does not attempt verisimilitude. Children want their art to look real. Verisimilitude is hard, especially for children and quasi AI.
What goes around comes around. I'm glad this is happening. Getty and friends should be driven out of business for the absurd stunt they pulled with image search.
Another step in the trend of everything becoming digital (in film and otherwise). It used to be that everything was done in camera. Then we got green screens, then advanced compositing, then CGI, then full realistic CGI movies modeled after real things and mocap suits. Now we're at the end game, where there will be no cameras used in the production of a movie, just studios of people sitting at their computers. Because more and more, humans are more efficient at just about anything when aided by a computer.
Hopefully we will see AIs with tools which are not "paint" or "notepad", but a maths formal proof solver, etc.
But I have a problem: I am unable to believe the videos I saw were dreamt up by an AI. Deep down I believe there must be some trickery or severe embellishment. If I am wrong, I guess we are at an inflexion point.
I can recall 10+ years ago, we were talking "in hacking groups" about AI because we thought the human brain alone was not good enough anymore... but in a maths/sciences context.
Visual sharpness at the expense of wider-scale coherence (see: sliding/floating walking woman in Tokyo demo or tiny people next to giant people in Lagos demo) seems to be a local optimum consistently achieved by today's SOTA models in all domains.
This is neat and all but mostly just a toy. Everything I've seen has me convinced either we are optimizing the wrong loss functions or the architectures we have today are fundamentally limited. This should be understood for what it is and not for what people want it to be.
>Visual sharpness at the expense of wider-scale coherence (see: sliding/floating walking woman in Tokyo demo or tiny people next to giant people in Lagos demo)
Wider-Scale coherence is still much better than previous models and has consistently been improving. It's not "visual sharpness at the expense of coherence". At worst, the models are learning wider-scale coherence slower.
Not everything is equally difficult to learn so it follows that some aspects will lag behind others. If coherence weren't improving you might have a point but it is so...
Scaling laws operate in the limit but eventually practical considerations dominate. There's a lot we haven't yet fully appreciated about biological vision and cognition -- and indeed, common sense as regards sensible video generation and processing -- that have not made their way into this kind of model. NeRFs are interesting and I hope to see more from that side of things in the coming months and years.
Yes and in that time we've learned some important lessons that it would be unwise to ignore, e.g. comprehension of 3D geometry despite 2D input visual data.
If I put goalposts ahead of the ball every time I kick it, can I be said to be accurate? If we don't specify what the actual goal is before calling something an improvement, can we say we are doing anything meaningful?
Who owns a person’s likeness? Now that we’re approaching text to video of a quality that could fool an average person, won’t this just open a whole new can of worms if the training models are replicating celebrities? The ambiguity around copyright when something on paper is in the style of seems to fall into an entirely separate category than making AI generated videos of actual people without their consent. Will people of note have to get a copyright of their likeness to fight its use in these models?
No need to take the bet, reality is already there. Miku is the endgame for idols. Forever young. Will never have a boyfriend. Always follows the script, or not when the team managing her decides they need a little drama. etc. etc. etc.
I do wonder why OpenAI chose the name "Sora" for this model. AI is now going to have intersectionality with Kingdom Hearts. (At least you don't need a PhD to understand AI.)
Sora means sky in Japanese, their reasoning is akin to "the sky's the limit".
> The team behind the technology, including the researchers Tim Brooks and Bill Peebles, chose the name because it “evokes the idea of limitless creative potential.”
Very late so probably invisible to all, but is this just a byproduct of OpenAI's work on understanding of video input? The Google Gemini presentation video suggested that this is the next step-level of AIs. Already with GPT-4V, being able to converse with an AI about the contents of an image feels surreal. The applications that become possible with an AI that can just look at video streams are incredible.
All the examples feel so familiar, like I have seen them all before buried in the depths of YouTube and long-forgotten BBC documentaries. Which I guess is obvious knowing roughly how the training works.
I guess what I'm wondering is how "new" the videos are, or how closely do they mimic a particular video in the training set? Will we generate compelling and novel works of art with this, or is this just a very round-about way of re-implementing the YouTube search bar?
Also interesting that some of the examples ignore details in the prompts. No clouds or sun in the sky, no depth of field, their hair isn't blowing in the wind.
RE worrying about the future: what concerns me most is post-truth reality. Being thrown into a world where it's impossible to tell fact from fiction is insane and dangerous. Just thinking about it evokes paranoia.
We're nowhere near full automation; these are growing pains, but maybe the canary in the coal mine for the job market. Expect more enthusiasm for UBI or negative income tax and the like, and policies to follow. Cheap energy is also coming eventually, just more slowly.
It is honestly quite concerning just how good these videos look.
Like you can see some weird artifacts, but take one of these videos, compress it down to a much lower quality and with the loss of quality you might not be able to tell the difference based on these examples. Any artifacts would likely be gone.
Given what I had seen on social media I had figured anything remotely real was a few years away, but I guess not...
I guess we have just stopped worrying about the impact of these tools?
If we extrapolate from DALL-E 3, it won't be anywhere near competitive on cost, even while they hold the superior ground. Generating a high-quality 1024x1024 image costs around ~$0.002 elsewhere, but $0.08 on DALL-E 3 (40x more expensive per image). For videos with very high computational needs (since each frame needs to be temporally consistent, you need huge GPUs to serve this), I'm expecting this to be much more expensive than its competitors (Pika or SVD 1.1).
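To put rough numbers on the video case, a back-of-envelope sketch; the big (and probably wrong) assumption is that per-frame cost is comparable to a single image generation, ignoring the overhead of the temporal-consistency machinery:

    # Back-of-envelope only; assumes per-frame cost ~= one image generation.
    per_frame_cheap  = 0.002   # ~$ per 1024x1024 image on cheap open-model hosting
    per_frame_dalle3 = 0.08    # $ per DALL-E 3 HD image
    seconds, fps = 60, 24
    frames = seconds * fps     # 1440 frames in a one-minute clip

    print(f"cheap-hosting floor:    ${frames * per_frame_cheap:.2f}")   # $2.88
    print(f"DALL-E-3-style pricing: ${frames * per_frame_dalle3:.2f}")  # $115.20

Either way, per-minute video pricing will hinge on how much the temporal machinery adds on top of that floor.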
This is amazing and was to be expected. Are there any good solutions that can be used to prove a video is not generated? I guess in some ways we have come full circle and are back to trusting individual journalists and content creators. I just didn't think it would happen this fast.
The gold rush scene is the most captivating to me. The film style looks like it's from the 70's/80's (reminds me of Little House on the Prairie), but the footage is from a drone standpoint. I find it magically immersive in a time when none of the technology to make the shot would have existed.
The rendering of static on the TVs is interesting/strange. Must be hard for AI to generate random noise:
Video 7 of 8 on the 2nd player on the page.
> Prompt: The camera rotates around a large stack of vintage televisions all showing different programs — 1950s sci-fi movies, horror movies, news, static, a 1970s sitcom, etc, set inside a large New York museum gallery.
The videos don't feel real, though this is the best thing I have ever seen on the topic of text-to-video. I am sure this will go far and become more realistic. But does this mean that instead of hiring actors and creators we will hire video editors who can stitch it all together and prompt writers who can create tiny videos for the story?
The technical report mentions that the training data was fed in at up to 1920x1080 (allowing for a vertical 1080x1920 as well), so I'd guess that's why all of these videos were 1080p or lower; any larger and it probably gets wonky fast. I didn't see anything on absolute compute requirements or their impact on time to generate, though.
"so far ahead"
"leaps and bounds beyond anything out there"
"This is insane"
Let's temper the emotions for a second. Sora is great, but it's not ready for prime time. Many people are working on this problem that haven't shared their results yet. The speed of refinement is what's more interesting to me.
I wonder how much of a blocker the lack of things like model rigging or fine-grained control will be to practical use of this. Clearly it can be used in toy examples with extremely impressive results, but I'm not entirely convinced that, as is, it can replace the VFX industry as a whole.
Holy cow, I've literally only looked at the first two videos so far, and it's clear that this absolutely blows every other generative video model out of the water, barely even worth comparing. We immediately jumped from interesting toy models where it was pretty easy to tell that the output was AI generated to.. this.
This really seems like "DALL-E", but for videos. I can make cool/funny videos for my friends, but after a while the novelty wears off.
All of the AI generated media has this quality where I can immediately tell that it's ai, and that becomes my dominant thought. I see these things on social media and think "oh, another ai pic" and keep scrolling. I've yet to be confused about whether something is ai generated or real for more than several seconds.
Consistency and continuity still seem to be a major issues. It would be very difficult to tell a story using Sora because details and the overall style would change from scene to scene. This is also true of the newest image models.
Many people think that Sora is the second coming, and I hope it turns out to have a major impact on all of our lives. But right now it's looking to have about the same impact that DALL-E has had so far.
Yeah, you really have to fast-forward 5 to 10 years. The first cars or airplanes didn't run particularly well either. Soon enough, we won't be able to tell.
I'd love to feel excited by all these advancements and somehow I feel numb. I get part of the feeling (worry about inequalities it may generate), but I sense something more. It's like I see it as a toy... I'm unable to dream on how this will impact my life in any meaningful way.
Imagine dumping all the HIPAA data into a process like this. Obviously fraught with privacy and accuracy[0] concerns. Nonetheless, it might help us move some things forward.
What makes OpenAI so far ahead of all of these other research firms (or even startups like Pika, Runway, etc.)? I feel like I see so many examples of fields where progress is being made all across and OpenAI suddenly swoops in with an insane breakthrough lightyears ahead of everyone else.
The scene of the train could easily be used as a transition scene in a movie. There's so much here: stock videos are gonna be f*cked in short order, and if they add composition and planning tools, and LoRAs, so will the movie industry.
- Local/Bespoke high quality video content creation by ordinary Joes: Check.
- Ordinary Joes making fake porn videos for money: Check.
- Reduce cost for real movies dramatically by editing in AI scenes: Check.
In the future, we're not going to have common tv shows or movies. We'll have a constantly evolving stream of entertainment that's perfectly customized to the viewer's preferences in real time. This is just the first step.
In the last few days I've been asking myself what would drive the next big leap in advertising efficiency after big data and conversion pixels. I think I have my answer now. This is going to disrupt the ad agency side of the business big time.
This is very impressive. I know people are generally iffy about research benchmarks. How does evaluation even work for text-to-video use cases? I want some intuition for how much better this is than other systems like Pika, quantitatively.
Wonder how the folks at Runway and Pika are thinking about this.
To me, it's becoming increasingly obvious that startups whose defensibility hinges on "hoping OpenAI doesn't do this" are probably not very enduring ones.
Technically breathtaking, but why do these examples of AI-generated content always have a cheap clipart vibe about them? So naff and uninspired given the, no doubt, endless potential this technology has.
I also feel a sense of dread too. Imagine the tidal wave of rubbish coming our way. First text, then images and now video can be spewed out in industrial quantities. Will it lead to a better culture? In theory it could, in practice I just feel like we'll be deluged with exponentially more mediocre "content" .
The results are mind-blowing, to say the least. But will they allow developers to fine-tune this eventually? OpenAI has yet to give that ability for its txt2img DALL-E models, so I doubt that will be the case.
Where is the link to try it, ChatGPT doesn't know anything about it:
"Sora" is not a video generation technology offered by OpenAI. As of my last update in April 2023, OpenAI provides access to various AI technologies, including GPT (Generative Pre-trained Transformer) for text generation and DALL·E for image generation. For video generation or enhancement, there might be other technologies or platforms available, but "Sora" as a specific product related to OpenAI or video generation does not exist in the information I have.
If you're interested in AI technologies for video generation or any other AI-related inquiries, I'd be happy to provide information or help with what's currently available!
They're attaching metadata to the videos which can be easily removed. Aren't there techniques to hash metadata into the content itself? I.e. such that removing the data would alter the image.
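Yes; that's essentially invisible watermarking (steganography): the mark lives in the pixels themselves, so stripping metadata leaves it intact. Here's a toy least-significant-bit sketch of the idea; note that real systems embed in DCT/DWT transform domains so the mark survives compression and resizing, which this naive version would not:

    import numpy as np

    def embed(pixels: np.ndarray, bits: np.ndarray) -> np.ndarray:
        # Overwrite the lowest bit of the first len(bits) channel values.
        flat = pixels.flatten()                  # flatten() returns a copy
        flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits
        return flat.reshape(pixels.shape)

    def extract(pixels: np.ndarray, n_bits: int) -> np.ndarray:
        return pixels.flatten()[:n_bits] & 1

    frame = np.random.randint(0, 256, (1080, 1920, 3), dtype=np.uint8)
    mark  = np.random.randint(0, 2, 64, dtype=np.uint8)   # 64-bit payload
    assert (extract(embed(frame, mark), 64) == mark).all()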
It's a revolutionary thing, but I'll reserve my judgment until I see if it can handle the real challenge: creating a video where my code works perfectly on the first try.
OpenAI: Prompt: The camera follows behind a white vintage SUV with a black roof rack as it speeds up a steep dirt road surrounded by pine trees on a steep mountain slope...
Wow. And just like that, fliki.ai and similar products have been sherlocked. Great time to be a creator; not the best time to be a product developer or production designer.
I am very uncomfortable with this being released commercially without the requisite defence against misuse being also accessible. If we didn't have a problem with deepfakes, spam, misleading media before, we surely are now. All leading AI organisations are lacking here, benefiting from the tech but not sufficiently attacking the external costs that society will pay.
Something like a watermark (doesn't necessarily have to be visible to people) and a tool to detect that watermark might be nice for example. Or alternatively we could stop developing this hell technology and try to automate something that isn't cultural expression.
Two things I would like: advances in detectors for generative content that don't do C2PA, and more transparency about what the usage policy means in practice.
This is bananas. This is ahead of anything else I've seen. The entire stock footage industry may be shut down over night because of something like this.
And it is still not perfect. Looking at the example of the plastic chair being dug up in the desert[1] is frankly a bit... funky. But imagine in 5 or even 10 years.
Just in time for the election season. Also "A cat waking up its sleeping owner demanding breakfast" has too many paws - yes I do feel petty saying this.
And the sleeper's shoulder gets converted into the duvet? And a strange extra hand somewhere. It was also the one that stood out to me as the worst. The quality was good, but it had the same artifacts as previous generations of AI videos, where things morph.
This inside VR goggles would be amazing. It probably wouldn't even need to render 360 degrees; it could generate on demand. I'd better go get a feeding tube.
That's the difference between Donkey Kong Country and the N64 (or perhaps between Pixar and Quake).
The amount of power needed to generate this can't be feasible for real time VR today. There's a reason even the company that invented (massive and free) Gmail is charging for its top tier generative AI.
Silver lining in this I guess. If everyone realizes at the same time they're all f'd together, regardless of "skill", then maybe there's a chance we can all work together to save ourselves.
No chance to think "sucks for you, but I'm good here" like so often happens with other issues.
I find the watermark at the bottom right really interesting: at first it looks like random movement, and then at the end it transforms into the OpenAI logo.
Almost certainly troves of stock footage. The type of exaggerated motion seen in these examples is very reminiscent of stock footage. And it is heavily textually annotated for search.
On one side, we have people who are upset because the creators of the videos in the dataset used to train this model were not compensated.
On the other hand, people find the tech very impressive and there are a lot of mind blowing use-cases.
Personally, this opens up the world for me to create video ads for software projects I create, since I have no financial resources or time to actually make videos, I only know how to code. So I find it pretty exciting. It's great for solo entrepreneurs.
I honestly expected video generation to get stuck at barely consistent 5 second clips without much movement for the next few years. This is the type of stuff I expected to maybe be possible towards the end of the decade. Maybe we really are still at the bottom of the S curve which is scary to think about.
It's been said a thousand times, but the "open" in openai becomes more comical every day. I can't imagine how much money they will generate from such a tool, and I'm sure they will do everything possible to keep a tight lid on all the implementation details.
No, corporate announcements are very much planned in advance. There's a lot of coordination that has to happen. This is just coincidence, unless one of the companies had inside information about the other's announcement and timing. But that's pretty unlikely.
Looked at the first clip and immediately noticed the woman's feet swap at ~15 seconds in. My eyes were drawn to the feet because of the extreme supination in her steps.
Looks like a dramatic improvement in video generation but still a miss in terms of realism unless one can apply pose control to the generated videos.
Quite the technical feat, I suppose, but the actual result is nightmare fuel -- legs swapping places, people walking into simulacra of spaces -- just deeply unsettling uncanny valley stuff.
These are insanely good, but there are still some things that just give them away (which is good, imo.) Like the Tokyo video is amazing, the reflections, etc are all great, but the gaits of people in the background and how fast they are moving is clearly off. It sticks out once you notice it. These things will obviously improve as time marches on.
The fear I have has less to do with these taking jobs, and more with the fact that eventually this will be used by a foreign actor and no one will know what is real anymore. This already happens with news stories; now imagine it with actual AI videos that are near indistinguishable from reality. It could get really bad. Have an insane conspiracy theory? Well, now you can have your belief validated by a completely fictional AI-generated video that even the most trained eyes have trouble debunking.
The jobs thing is also a concern, because if you have a bunch of idle hands that suddenly aren't sure what to believe or just believe lies, it can quickly turn into mass political violence. Don't be naive to think this isn't already being thought of by various national security services and militaries. We're already on the precipice of it, this could eventually be a good shove down the hill.
Real AGI is farther away than I think people think, and the tendency for mankind to destroy itself is much better demonstrated than machines doing that even when that time comes.
I'm just blown away. This can't be real. But let's face the truth.
It's even more impressive than ChatGPT. I think it's the most impressive AI tech I've seen till now.
I'm speechless.
Now for the big question. As OpenAI keeps pushing boundaries, it's fascinating to see the emergence of tools like Sora, capable of creating incredibly lifelike videos. But with this innovation comes a set of concerns we can't ignore.
So I'm worried about these tools being misused.
What impact could they have on the trustworthiness of visual media, especially in an era plagued by fake news and misinformation? And what about the ethical considerations surrounding the creation and dissemination of content that looks real but isn't?
And what should we do to tackle these potential issues? Should there be rules or guidelines to govern the use of such tools, and if so, how can we make sure they're effective?
Probably we humans will reach a point where we won't even bother making videos. We may just consume content generated on the fly by such services, based on our emotional state.
> Sora serves as a foundation for models that can understand and simulate the real world, a capability we believe will be an important milestone for achieving AGI.
The 3D consistency of those videos is insane compared to what has previously been done; they must have used some form of 3D regularization with depth or flow, I think.
Apple Vision Pro VR + unlimited, addicting... I mean, engaging video feed into your eyes. The machines will keep you tube fed and your bowels emptied. Woe to the early 21st century techno-optimism. An alien intelligence rules the galaxy now. Welcome to the simulation.
Then change the prompt? It is a demo, after all.
From a creator's perspective, those shots are awesome for inspiration and/or as a tool to create something bigger.
It's always kind of crazy to me to see an emerging technology like this have its next iteration in the development pipeline, and even after seeing the first-gen AI video models, many of the HN people here still say, "Meh, not that impressive."
Brother, have you seen Runway Gen 2, or SVD 1.1? I'm not excited about Sora because I think it looks like Hollywood animations, I'm excited because an open-source 3rd-Gen Sora is going to be so much better, and this much progression in one step is really exciting!
This is impressive and amazing. I can already see a press release not too far down the road: "Our new model HoSapiens can do everything humans can do, but better. It has been specifically designed to deprecate humanity. We are working with red teamers — domain experts in areas like union busting, corporate law and counterinsurgency — plus our habitual bias, misinformation, and hateful-content-against-AI orange team, who will be adversarially testing the model."
This is going to make the latest election really interesting (and scary). Is anyone working to ensure a faked video of Biden that looks plausible but is AI generated doesn't get significant traction at a critical moment of the election?
That just doesn't seem like a plausible scenario to me. Obviously, if such thing happened, Biden would have an alibi, since it's known where he is at all times.
The people who already hate Biden, probably already think he's doing some weird shady stuff, and would point to some conspiracy. The people who like Biden, would accept the alibi.
Ultimately it wouldn't move the needle.
What is concerning, is the technology being used against a regular person, who may not have an alibi.
right, we all knew AI would be closer to realization in 2020. of course the first one to do it is some complete sellout asshole, affirming hateful rhetoric like "we have to make things safe", which is just thinly veiled pro-police-state sentiment. every single reason you can come up with for why this is "unsafe" is just police-state mentality.
"porn without consent" - thought crime
"too much porn of whatever you dream of" - yes, conservatives (50% of USA) actually think this is a problem
"spam" - advancing the walled-garden model email is already heading towards. soon you will simply need government ID to make an email account, even though there are plenty of alternative ways to communicate aside from email, which was already considered insecure and a bad protocol in 2000. this has nothing to do with AI, but they still acknowledge this absurdity by framing AI as the enabler of it.
"automated social engineering" - just weaponizing the ignorance the bad thought leaders of the industry left us with. instead of giving us proper authentication methods, we still have "just send my photo ID to these 33 companies, which will ask for it in random ways we don't expect and just have to trust them"
"copyright" - literally not a problem, almost nothing "protected" by copyright matters and the law is just used by aggressive capitalists to shove their products down everyone's throat
"ICBMs being automatically hacked and launched at people" - just stop being a bad government and hiring completely non-credible people to implement every mission-critical control system while hooking it up to the internet
"racist bias" (or whatever) - this is the dumbest fucking thing i've ever heard of
this website is a perfect snapshot of why tech sucks so hard. it's dressed up like a cinematic film using a ton of JS libs and CSS hacks or god knows what, so it can only be viewed smoothly on the latest computer hardware, and only on one of the big 3 browsers that have each had a trillion man-hours of pointless iterations driven by digital-graphics marketing companies. and on top of that they have a nice professional tone made by $300K/year PR people. please, sincerely, fuck off.
Here is my prediction of how this will play out for the entertainment industry in the coming decades:
Phase 1 (we are here now): While generative AI is not good enough to directly produce parts of the final product, it can already be used to quickly prototype styles, stories, designs, moods, etc. A good chunk of the unnamed behind-the-scenes people will lose their jobs.
Phase 2: While generative AI is still expensive, the output quality is sufficient to directly produce parts of / the entire final product. Big production outlets will use it to produce AAA titles and blockbusters. Even actors, directors and other high publicity positions will be replaced.
Phase 3: The production cost will sink further until it becomes attainable by smaller studios and indie productions. The already fierce markets will be completely flooded with more and more quantity over quality. Advertisement will not be pre-produced and cut into videos anymore but become very subtle product placements, impossible for ad-blockers to strip from the product.
Phase 4: Once the production cost falls below the price of one copy of the product, we will get completely customized entertainment products tailored to our personal taste. Online communities will emerge which craft skeletons / templates which then are filled out by the personal parameter sets of the consumers. That way you can still share the experience with friends even though everybody experiences a different variation.
Phase 5: As consumers do not hit any production limits any more (e.g. binge watch their favorite series ad infinitum) and the product becomes optimized to be maximally addictive by measuring their reaction to it, it will become impossible for most human beings to resist. The entertainment mania will reach its peak and social isolation, health issues and economic factors will bring down the human reproduction rate to basically zero.
Phase 6: Human civilization collapses within one or two generations and the only survivors will be extremely technology-averse people, by selection. AGI might have happened in the meantime but did not have the time to gracefully take over and remodel the human infrastructure to become self-sufficient. Instead a strict religion will rule the lands and the dark ages will begin anew.
Note that none of this is new; it is just the continuation and intensification of already existing trends. This is also not AGI doomerism, as it does not involve a malicious AGI gone rogue or anything like that. It is simply what happens when human nature meets powerful technology.
TLDR: While I love the technology I can only see very negative long-term outcomes from this.
As several others have pointed out, realism of these models will continue to improve, and will soon be economically useful for producing beautiful or functional artifacts - however prompt adherence (getting what you want or intend) of the models is growing much more slowly.
However, I think we have a long way to go before we'll see a decent "AI Film" that tells a compelling story - and this has nothing to do with some sort of naturalistic fallacy that appeals to some innate nature of humans!
It comes down to the dataset and the limits of human creators in their ability to communicate their process. Image-Text and Video-Text pairs are mostly labeled by semi-skilled humans who describe what they see in detail. They are, for the most part, very good at capturing the obvious salient features of an image or a video: "reflections of the neon lights glisten in the sidewalk". However, what you see in a movie scene is the sum total of dozens if not hundreds of influences, large and subtle. Choices made by the actors, camera operators, lighting designers, sound designers, costuming, makeup, editors, etc... Most people are not trained to recognize these choices at all, or might not even be aware that there are choices to make. We (simply) see "Joaquin Phoenix is making awkward small-talk in the elevator with other office workers".
So much of what we experience is processed on subconscious, emotional, and purely sensory levels; we don't elevate those lower-level qualia to our higher brain's awareness and label them with vocabulary without intentional training (such as tasting wine, coffee, beer, etc. - developing a palate is an act of sensory-vocabulary alignment).
However, despite our not raising these things to intentional awareness, they have an influence on us -- often the desired impact of the person who made each choice in the first place. The overall effect of all of these intentional choices makes things 'feel right'.
There's no fundamental reason AI can't produce an output that has the same effect as those choices, however finding each little choice is like a needle in a haystack. Accurate labeling of the training data tells the AI where to look -- but the people labeling the data are probably not well-versed in all of the little intentional choices that can be made when creating a piece of video-media.
Beyond the issue of the labeling folks being trained in the art itself, there's the problem too of the artists themselves not being able to fully articulate their (numerous, little, snowflake-into-avalanche) choices - or simply not articulating it even if they could. Ask Jackson Pollock about paint viscosity and you'll learn a great deal, but ask about abstract painting composition and there's this ineffable gap that language seems ill-suited to cross. The painter paints what they feel, and they hope that feeling is conveyed to the viewer - but you'd be hard pressed to recreate "Autumn Rhythm (Number 30)" if you had to transmit the information via language and hope they interpreted it correctly. Art is simultaneously vague and specific!
So, to sum up the problem of conveying your intent to the model:
- The training data labels capture obvious or salient features, but not choices only visible to the trained eye
- The material itself is created by human artists who might not even be able to explain all of their choices in words
- You the prompter might not have the vocabulary that captures succinctly and specifically the intended effect
- The end result will necessarily be not quite what you imagined in your mind's eye as a result of all of this missing information
You can still get good results if you tell it to copy something, because the label "Tarantino" captures a lot of detail, even all the little things you and the training data would never have labeled in words. But it won't be yours and - until we have an army of trained artists providing precise descriptions for training data in their area of expertise, and you know how to speak those artists' language - it can't be yours.
Call me whatever you want, but this technology should not exist.
Letting people create lifelike videos of anything they can put their minds to is bound to ruin many people's lives.
For every person who is aware of and interested in this technology, there are 100 who have no idea, don't care, or can't comprehend it. Those are the people I fear for. Grab a few pictures of the grandkids off of Facebook, and now they have a realistic ransom video to send.
Am I being hyperbolic? I don't think so. Anything made by humans can be broken. And once it's broken and out there, good luck.
You mean like they stopped trusting the Internet, or YouTube videos, or newspapers, or old broadcast TV news? Except they didn't, because it's impossible to live life successfully without information sources beyond one's eyes and ears.
uhhh… i would respond differently to your rhetorical question…
there’s never been greater distrust of legacy media, and the fact that you can’t trust everything you read on the internet has been a trope for decades
Maybe it's useful to ponder why faked, photoshopped pictures were never a big problem in human life. I think maybe it's partly because we use a lot of pictures in our lives, sure. But ultimately we have so much context that a fake would be easy to detect and therefore irrelevant. At most, people used Photoshop to alter images of documents.
Becoming good enough at Photoshop to do a convincing face swap was something that took a lot of time and skill. Not everyone with a copy of Photoshop had the ability to create a compromising photo of a politician, for example.
Alcohol production didn't require massive amounts of funding, energy and compute power. Any shmuck could make moonshine in their bathtub. Shut down OpenAI and make their racket illegal, and who's going to have the resources to continue their work?
You're right, there's a very real danger that bad actors within OpenAI could hand their research to China. But that's not an inevitability. We've managed to block certain countries from developing nuclear weapons technology. We can do it with this too.
I wasn't even thinking about an OpenAI defector releasing secrets. I think it's a matter of research and compute power, both of which China is increasing in these sectors so they don't trail behind the USA.
If we stop developing our tech, China will continue advancing even without leaks from OpenAI, and eventually they will develop better models, even if it takes another decade. Banning something only to cower in fear of ourselves while we watch someone else use it to destroy us seems like poor planning.
Yeah, that's why we have school shootings every day in Europe and Australia. Oh wait, we don't. Banning might not work well for some things, but this can totally be banned. Your comment is a blatant misrepresentation of how effective bans can be. At best. At worst, it's willful undermining of democracy.
>> Yeah, that's why we have school shootings every day in Europe and Australia. Oh wait, we don't. Banning might not work well for some things, but this can totally be banned.
Sure, now let's talk about knife wounding and acid attacks...
The fundamental issue of human violence still exists.
This will probably cost me some downvotes, but can we start a thread explaining the architecture behind this, for those interested in how it actually works?
"We’ll be taking several important safety steps ahead of making Sora available in OpenAI’s products. We are working with red teamers — domain experts in areas like misinformation, hateful content, and bias — who will be adversarially testing the model."
- To make sure that the perfectly unbiased algorithms are biased against bias. So in essence, red teamers as in commies I suppose.
Does anyone know how to handle the depression/doom one feels with these updates?
Yes, it's a great technical achievement, but I just worry for the future. We don't have good social safety nets, and we aren't close to UBI. It's difficult for me to see that happen unless something drastic changes.
I'm also afraid of one company just having so much power. How does anyone compete?
>Does anyone know how to handle the depression/doom one feels with these updates?
Realize that it's a choice to respond to things this way. This feeling comes from a certain set of assumptions and learned responses.
Remember that people are bad at predicting the future. Look at the historical track record of people predicting the implications of technological advancement. You'll find that almost nobody gets it right. Granted, that sometimes means that things are worse than we expect, but there are also many cases where things turn out better than we expect. If you're prone to focusing on potential negatives, maybe you can consciously balance that out by forcing yourself to imagine potential positives as well.
Try to focus on things you personally have control over. Why worry about something that you can't change? Focus on problems that you can contribute to solving.
I've been personally affected by technology advancements, and had to spend lots of time and effort recovering professionally from it. Mind you, I'm not saying it cannot be done, but those that do get affected have to work harder than those that don't.
It's easy to say "don't worry" if you haven't been affected by events like this. I feel it's stronger for society to say "I don't know what will happen, but we'll work through it together."
It is easy to wax philosophical when it doesn't affect you directly. There are folks who have been honing their VFX skills for close to a decade, and they will be impacted in a significant way.
It's not waxing philosophical, it's concrete advice for how to handle negative emotions associated with uncertainty and instability. The way things are going, it's very likely I'll be affected directly by these developments at some point. When that time comes, I'm not sure what will be better advice than focusing on the problems that I'm personally able to solve and looking at the potential upsides of the situation.
I think the issue with “don’t worry about things you can’t control” is, in this tech forum, not as valid as you might think.
We are building this technology; suggesting we have no agency is helpful for avoiding any feeling of responsibility or guilt — perhaps rendering your comment within the realm of waxing philosophical.
Who better to worry about this than the people of hacker news?
From a pure mental health standpoint, sure, it’s solid advice but I think it’s narrowed the context of the broader concern too much.
An alternative to learned helplessness of “nothing you can do” is to encourage technologists to do the opposite.
Instead of forgetting about it, trying to put it out of your mind, fight for the future you want. Join others in that effort. That’s the reason society has hope — not the people shrugging as people fall by the wayside.
Depression mediation by agency feels more positive, but I don’t have a lot of experience tbh. Just a view that we, technologists, shouldn’t abdicate responsibility nor encourage others to do so.
That culture, imo, is why a large section of tech workers, consumers and commentators see the industry in a bad light. They’re not wrong.
EDIT: to add, “what problems can I personally solve” also individualises society’s ability to shape itself for the better. “What problems can I personally get involved in solving”, “what communities are trying to solve problems I care about” is perhaps the message I’d advocate for.
I think the point is to start considering a back up plan and then...hakuna matata.
Cat's out of the bag. There is no legislation that will stop this. Not unless/until it has some obscene cost and AI gets locked down like nuclear weapons. But even then, it's just too simple to make these things now that the tech is known.
I sure don't know the answer but we just don't know what's coming next. Gonna have to wait and see.
Sure, I would include a broad set of things under "what you can control", including joining an organization, donating, voting, etc. The OP is excessively worrying about things they truly can't control, like the long-term political implications of emerging technologies.
HN is in the perfect position to wax philosophical; this behemoth is coming for tech too. I've started plotting out what will happen if I have to use my hands to make a living and I'd really rather not be doing that.
But reality is as reality is, and nobody is owed a desk job. These are very exciting times for what type of society could be built with this tech; human inefficiencies are responsible for a lot of suffering that we might be able to stamp out soon.
As someone who has suffered immensely from anxiety disorders and worrying / anger my whole life, this comment is wisdom right here.
Now in my thirties, and almost nothing I worried about has come true. I mean, tomorrow we might get wiped out by a runaway technological singularity, but I could've spent the last 30 years of my life worrying a lot less too.
I know that I should stop worrying as I have no control over what might happen but I can't stop worrying. What was the key for you that helped you let things go?
I went to cognitive behavior therapy, for me it was like someone opened up my mind and showed it to me on a screen, it was a mirror into my head. It was amazing how it felt like I could rewire my thought patterns over the course of a few months.
The main takeaway from it all however, was the mantra: _thoughts are not facts_.
If you can realize that your thoughts are not objective truths, you will be much better off in almost every aspect of your life, because after living this mantra for many years, putting it to the test constantly, I know it's solid.
Later on I read a lot of Buddhist philosophy, which matched incredibly well with the therapy, because a lot of Buddhist thinking and meditation practice is quite similar in its approach. This sort of reinforced the validity of the CBT, because I realized wise people have known about seeing things in an objective light for millennia, which was validating for me and helped me continue on the introspective path.
Basically, we're all hallucinating in one way or another, almost all of the time, and that is ok, just be aware of that. When we're worried about the future, we're worried about something which doesn't yet exist, which is actually crazy.
Of course it doesn't mean we should just ignore long term problems, no one advocates for that. But we shouldn't assume we know the outcome in advance because that often causes stress.
Warning: I think that for most westerners, it's "safer" to get into something like CBT; Buddhism comes with some (IMO) very confronting ideas for a lot of people, whereas CBT is much more user-friendly for westerners.
The implications of technological advancement are always the same- if it can be used to replace people at a satisfactory level, it will. Appealing to stoicism is nice, but it's a bittersweet salve in this situation.
Honestly though, as if technological advancement has been overall worse for humans. Without it, we'd be fighting lions for food on the savannah forever. That might be appealing to some, but I'd prefer to have spears, fire, shelter, medicine etc.
Industrial scale technology might ruin us though, so you might have some point, mostly I'm referring to climate change which is for sure the greatest existential threat imaginable right now. However it seems technology might bail us out here too, nuclear and renewables.
I have no issue with technological advancement, it's obviously one of the pinnacles of human achievement- I have an issue with how those advancements are spread about and shared, especially shortly after large technological advancements happen.
We undoubtedly have reaped immense benefits from the industrial revolution for example- that doesn't mean I'd have any interest in living through it or that it was executed in a way that prioritized the people who lived during those times.
Open source stuff is great, and I support it and have contributed to projects myself, but people bandy it out as if it's a silver bullet and I have my reservations there. The issue goes way beyond technology itself, it's structural/sociological/cultural and that's not going to be fixed just because there are open source alternatives.
It doesn't stop that, but would you prefer a world where you're unemployed and locked out of the technology, or unemployed, and have access to the technology so you can learn and use it for free to maybe get back in the game?
there must be a name for the fallacy in the 'more of a good thing is always a good thing' line of reasoning. almost every good out there is good within a certain range. outside of that range it becomes detrimental, possibly deadly. there is even a Swedish word for the right amount, https://en.wikipedia.org/wiki/Lagom. a few examples:
material:
- water: too little => thirst, too much => drown
- heat: too little => freeze, too much => burn
- food: too little => starvation, too much => obesity
spiritual:
- courage: too little => cowardice, too much => foolhardiness
- diligence: too little => slothfulness, too much => workaholism
- respect: too little => disregard, too much => idolatry
I understand your sentiment entirely, but it's not what I said. I didn't say abundance is everything we should strive for; I said that having more efficient systems is good.
> Realize that it's a choice to respond to things this way.
Why do people always say this / think that saying this is helpful? Try saying to someone with ADHD, "realize that you are choosing not to get your chores done today. You're choosing not to get out of bed on time. You're choosing to show behavior that your peers describe as 'lazy'. This will keep happening as long as you let it!"
So what if you have the ability to choose whether you are depressed or not? Not everyone got the same choice. Not everyone still has that choice.
I don't really expect another solution, but this always kind of bothers me when I see people saying everything is a choice.
With neurodivergence and mental disorders, what you see as "choice" can end up not being a choice at all.
At a physical level, we don't have control over anything, it's all just subatomic particles bumping into each other. That doesn't mean all perspectives are equally helpful for solving problems and functioning in the world. I mostly agree with your points, but where we might disagree is whether it's useful to have certain psychological categories or disorders become part of one's identity.
> where we might disagree is whether it's useful to have certain psychological categories or disorders become part of one's identity.
You might read my comment as trying to claim that my disorders define me and that because I have these disorders I can afford to give up on this stuff because 'it's hopeless'. Truth is I've been trying to get past this for damn near a decade at this point and it's not nearly as easy as you make it out to be, and that's why I say that I don't have the same choice you think I do.
I didn't even know I had ADHD until a year or so ago, I'd just routinely lose the ability to do the stuff I love and I'd have to go find something else to do instead. Depression would stem from all the things I knew I loved but that I could no longer motivate myself to do. In fact I was probably even worse off before I knew about this because I thought that I was just doing something wrong, not being controlled by an invisible menace that most other people don't even know exists
I don't mean to be hostile or to imply that it can't be as easy as you're describing. I just don't think it's right to say that how you react is always just a choice.
I have tons of completely involuntary reactions, caused primarily by trauma, and I can't control them. They do things like force me literally out of consciousness with overwhelming guilt and/or sadness. That's not a choice. I didn't choose that. That's completely autonomous!
It is objectively a better survival strategy, in a complex enough society, to focus on unfair advantages and let the society burn to the ground. The suckers are going to take care of it and eliminate themselves too, and in a sense there's nothing more important than improving your own short-term self-preservation. This is, of course, actually psychopathic.
>> What I posted is what I have personally found to be the most useful advice in overcoming self-destructive mental habits
I'm glad a one-time, one-line quip worked for you, but in my experience, positive mental habits are built over time, through support and continuous practice.
I apologize for over-responding, but let me attempt to be more clear:
If you are responding to people's problems with common one-liners, it can be interpreted as belittling them. It could be read as an attempt to over-simplify, or to make them feel "inferior" for failing to see and solve their issues, when those issues are, to them, much larger than a random one-line quip.
The OP was asking for advice dealing with negative emotions. I gave what I consider to be the best advice for dealing with negative emotions. Just because something is a "one-liner" doesn't mean it isn't also a deep truth about human psychology. If you interpret what I wrote as belittling them or trying to make them feel inferior, all I can say is I disagree with you, because I know what my motives were in responding.
This is excellent advice. I will also add that with change and uncertainty, it’s difficult for us to imagine how banal things can ultimately turn out to be.
For example, I’m getting text messages all day long from random politicians asking for money. If you told people 50 years ago that one day we’d be carrying devices where we could be pinged with unwanted solicitations all day and night, they might have imagined an asphyxiating nightmare. But in reality, it’s mainly a nuisance.
The point is that your brain makes all kinds of emotional predictions about the future, but they aren’t really very useful and if you’re experiencing depression or anxiety, I can guarantee they are biased predictions.
The tens of thousands of people working in entertainment building other people's visions can now be their own writers, actors, and directors. And they'll find their own fans.
Studios will go away. Disney will no longer control Star Wars, because your kids will make it instead. In fact, the very notion of IP is about to be driven to zero.
And OpenAI won't own this. They won't even let you do "off book" things, and that's a no-go for art. Open source is going to own this space.
There are other companies with results just as mature. They just didn't time a press release to go head to head with Gemini.
Sorry, but this is a 5th-grade take everyone on tech-heavy forums loves.
Only some people can make Star Wars (the pinnacle of independent filmmaking if you read Lucas's biography). It has nothing to do with the tools.
IP in the arts is how artists get paid.
I can assure you that no one in the creative industry feels liberated by these tools. Do you realise that just because you are good at lighting, you don't necessarily want to be an actor and make a movie? No, you like being good at lighting, working with others who are good at what they do, and creating a great work of art together.
AI imagery only knows what already exists. It's tough to make it do innovative technical effects and great new lighting. "oh my god, stock video sites are dead" - yes, exactly; stock, by definition, is commoditised.
What I see is the tens of thousands of people in troll factories producing content for the 3/4 of the world population ready to believe whatever they see on TV.
I think a more likely scenario is that people will be so used to it that a lot of people are going to have trouble believing that real things are real. Conspiracy theorists already suffer from this and it's going to get so much worse.
I think in the initial years there'll be some major incidents where a fake thing gets major attention for a few days until it's debunked, but the much larger issue will be the inverse.
> Be excited! The tens of thousands of people working in entertainment building other people's visions can now be their own writers, actors, and directors. And they'll find their own fans.
It's terrible news for the people being replaced. Their training and decades of experience are their competitive advantage and livelihood. When that experience becomes irrelevant because anyone can create similar-quality work at the push of a button, they're suddenly left with nothing of value in a world flooded with competition.
Fully agree. I got a bit depressed Nov 22 when chatgpt and midjourney dropped… and then realized midjourney would let me create images I’ve had in my head for years but could never get out. (At least, MJ gave a reasonable approximation)
People should already be skeptical of everything they see/read on the internet. I don't think this is going to change my media consumption habits dramatically.
Dall-E was crazy and then suddenly people were doing the same thing on consumer hardware with an open model within a year.
Filmmakers being able to bring their vision to life using generative models is going to create such a huge expansion of the market.
What people don't realize is that long term these advances are a death knell for mega-corps, not for individuals.
Why do I need to kiss Weinstein's ass to get my movie made if I can do it with a shoestring budget and AI and have the same assistance to create marketing materials, etc. I need a lot less money to break even and can focus on niche markets aligned with my artistic vision instead of mass appeal to cover costs plus the middlemen involved in distribution and production.
Film/video editing isn't exactly known as the industry where everybody loves their job and doesn't want to kill themselves.
I made a twitter thread[1] with weird metal cybertrucks using Midjourney a couple days ago. I personally enjoyed the process and do not have the talent nor the time to do that without generative AI. There are people who do have that talent, but honestly I doubt anyone else would've put in the time.
I think you might have it a little backwards. For most people, the fun part is "making a movie", not "watching hundreds and hundreds of hours of footage picking between 10 different shots". That's the drudgery, and that's the part generative AI can eliminate.
I've made my living that way and absolutely loved it. What I did not love (and partly why I left the industry) was the difficulty of getting paid decently at the bottom tier; I had the bad timing to come in right as the bottom was beginning to fall out of the indie market and making straight-to-video b-movies 3 or 4 times a year ceased to be a viable business model.
> I think you might have it a little backwards. For most people, the fun part is "making a movie", not "watching hundreds and hundreds of hours of footage picking between 10 different shots". That's the drudgery, and that's the part generative AI can eliminate.
No, that's the craft, and solving problems where the continuity doesn't line up, or production had to drop shots, or the story as shot and written sucks in some way, is where the art comes in.
The drudgery is things like ingesting all the material, sorting it into bins, lining up slate cues, dealing with timecode errors, rendering schedules, working your way through long lists of deliverables and so on. You have literally confused the logistics part with the creative act.
I have not confused it, I'm simplifying to make a point. Yes, of course there are many people who love the art of editing, or taking the right shot, or acting, or directing, or special effects, or all of the 100s of things that go into making a movie or TV show or other video.
But many of those things involve a lot of drudgery, and the drudgery is what these "AI" solutions are best at. If you want to go above and beyond and craft the perfect shot, that opportunity would still be available to you. Why would it not?
When we invented machines that make clothes, did that reduce the number of jobs in the clothing industry? When we got better and better at it, did that make fashion worse? No. If you want a machine made suit for $50, you can find one. If you want a handmade suit for $5000, you can find one.
Tech like this expands opportunities, it does not eliminate them. If and when it gets to the point where Sora is better at making videos than a human in every conceivable dimension, then we can have this discussion and bemoan our loss. But we're not even close to that point.
I don't buy this simplification claim; you literally described the core skillset as drudgery. Put another way, what parts of film editing do you not consider drudgery? Could it be that you tried it previously and just didn't really like it?
And with your suit example, you're looking at it from the point of view of consumer choice (which is great) without really looking at the question of how people in the clothing/textile industry are affected. It's difficult to find longitudinal data at the global level, but we can look at the impact of previous innovations (from outsourcing to manufacturing technology) on the US clothing market; employment there has fallen by nearly 90% over 30 years: https://www.statista.com/statistics/242729/number-of-employe...
The usual response to observations like this is 'well, who wants to work in the clothing industry; those people are now free to do other things; great opportunity for people in other parts of the world, etc.', but the constant drive to lower prices by cutting labor costs or quality has big negative externalities. Lots of people who used to make a living thanks to their skill with a sewing machine, at least in the US, are no longer able to monetize that and had to switch to something else; chances are they were less skilled at that other thing (or they'd have been doing it instead) and so suffered an economic loss while that transition was forced upon them.
The "someone must have lost out economically" argument falls fairly hollow when you actually look at the stats and see that the vast, vast majority of people end up better economically when we develop technology and increase efficiency.
Luddism is never the answer.
Scratch that; luddism is the answer for people who don't actually care about humanity as a whole (but frequently pretend they do) and just want their hobby or their job or their neighborhood to stay the same and for everyone else to stop ruining things. But for the rest of the world, increasing technological efficiency means more people get more things for less. This is good actually.
This reduces filmmaking to only editing. Filmmakers won't be choosing between 10 different shots but instead between 10 different prompts and dozens of randomized outputs of those prompts, and then splicing them together to make the final output.
Prompts are just the starting point. Take image generation for example and the rise of ComfyUI and ControlNet, with complex node based workflows allowing for even more creative control. https://www.google.com/search?q=comfyui+workflows&tbm=isch
I see these AI models as lowering the barrier to entry, while giving more power to the users that choose to explore that direction.
All that amounts to just more complex ways of nudging the prompt, because that prompt is all an LLM can "comprehend." You still have no actual creative control, the black box is still doing everything. You didn't clear the barrier to entry, you just stole the valor of real artists.
So wrong. There are some great modern artists in the AI space now who are using advanced AI tools to advance their craft. Look at Eclectic Method before AI, and look at how he is evolving artistically with AI.
Shadiversity made the same class of attribution error. AI users aren't evolving artistically, the software they are using to simulate art is improving over time. They are not creators, they are consumers.
Photographers have a great deal of creative control. Put the same camera in your hands versus a professional and you will get different results even with the same subject. You taking a snapshot in the woods are not Ansel Adams, nor are you taking a selfie Annie Leibovitz. The skill and artistic intent of the human being using the tool matters.
Meanwhile with AI, given the same model and inputs - including a prompt which may include the names of specific artists "in the style of x" - one can reproduce mathematically equivalent results, regardless of the person using it. If one can perfectly replicate the work by simply replicating the tools, then the human using the tool adds nothing of unique personal value to the end result. Even if one were to concede that AI generated content were art, it still wouldn't be the art of the user, it would be the art of the model.
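To make that reproducibility claim concrete, here is a minimal sketch, assuming the Hugging Face diffusers library and the public Stable Diffusion v1.5 checkpoint (both illustrative stand-ins): fix the model, prompt, and seed, and the pixels come out the same, at least on an identical hardware/software stack.

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    def render(prompt: str, seed: int):
        # A fixed seed pins down the initial noise, so sampling is
        # deterministic on the same hardware/software stack.
        g = torch.Generator("cuda").manual_seed(seed)
        return pipe(prompt, generator=g, num_inference_steps=30).images[0]

    a = render("a portrait in the style of Ansel Adams", 42)
    b = render("a portrait in the style of Ansel Adams", 42)

    import numpy as np
    assert np.array_equal(np.array(a), np.array(b))  # identical output, regardless of who runs it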
It takes different skills depending on how deep you want to go. Try setting up your own video-creation lab using Stable Diffusion to generate frames. It can make AI videos, but you also need a lot of Linux devops skills and Python skills.
I did in fact make the twitter thread. The images I used in said thread were generated using midjourney, which I stated here and in the thread (which I made, by tweeting).
I appreciate you being straight up about it. I wasn’t trying to be harsh, and I apologize for not being clear. I find the terminology used when using ai to create things interesting. “I wrote this using X” versus the never used “I instructed X to write this for me”.
Are you honestly comparing taking a photograph (and "properly", i.e. thinking about lighting and composition and such, versus firing off a snapshot on your phone) with typing "Make me a picture of Trump riding a dragon"?
Are you genuinely equating the profound and labor-intensive process of painting, with its meticulous brushstrokes, profound understanding of lighting, composition, and the tactile relationship between artist and canvas, to the trivial button pressing of photography?
Disclaimer: This post was generated using an llm guided by a human who couldn't be bothered explaining why you're wrong.
This comparison doesn't work because when you talk about photography, you say you need to do it "properly", but you don't apply that same logic to prompt crafting. Typing "Make me a picture of Trump riding a dragon" is not "proper" use of generative AI.
I became a software engineer because I enjoy coding. If you told me software will now be written by simply describing it to a computer, I would quit because that sounds like a fucking terrible way to spend your life. I assume that video editing and post production is the same: a creative problem that is enjoyable to solve in itself. When you remove any difficulty or real work from the equation, you probably get a lot of bad, meaningless content and displaced people without marketable skills
It's not that long ago in human history that basically none of the jobs we do now existed. So it is kind of myopic to think that any current career is a calling. Art can become a craft again, not a career. There is nothing wrong with that.
The issue is that those jobs that got automated to "become a craft again" have mostly vanished, except for high-end stuff. Some examples: shoe making, artisan furniture, tailors, watchmakers. Unless you are the best of the best these are hobbies now not something you make money from.
Nowadays most people make money in bleak half-automated jobs (e.g. construction, factory work) or in white-collar jobs, sitting in front of a computer in some cubicle doing some mind-numbing task for a megacorp.
I'm usually hyped about technological advancement, but very bleak about AI. I think it will just bring more subtle propaganda for state actors and more subtle advertising for megacorps; the dying of creative jobs like graphic artists or actors is just a sad side effect. (These will still exist, but only at the high end -- we will always have real AAA actors, but the days of extras on movie sets are numbered. Lots of the Hollywood protests were because studios started doing contracts for no-name actors stating that the studio would retain rights to the actor's digital likeness.)
When is a time in history when everyone had really great jobs? Before the industrial revolution, you had most people doing subsistence farming. During the industrial revolution, you had 14 hour a day exploited laborers working in factories. Maybe there was a brief period after World War II where you had a large middle class with stable careers and affordable housing. That's not the norm for the millions of years of history of human evolution.
To me, this reflects a perfectionist mindset. Life is better today for billions of people than it has been at any other point in the history of the human species. If you consider it a "bug" that we don't live in some sort of utopia where everyone's dreams are fulfilled, maybe you need to change your expectations and view things in a larger historical perspective.
It is perfectly possible to see that we live in the best time humanity has ever lived in and still be concerned that we are at risk of regressing, especially with people claiming that any concern about regression is simply failing to view things in a larger historical perspective.
Nope. People are concerned. There have been a million times when people recklessly and blindly did things without carefully examining the consequences, leading to terrible results and human suffering. Some examples: DDT, the Iraq war, fast fashion, early usage of radioactive materials as medicine, asbestos, etc.
> Some examples: shoe making, artisan furniture, tailors, watchmakers.
> Nowadays most people make money in bleak half-automated jobs (e.g. construction, factory workers) or in white collar jobs sitting in front of a computer in some cubicle doing some mind numbing task for a megacorp.
And all the while they enjoy abundance of shoes, furniture, clothes and watches with value/price ratio absurdly high by standards of most of human history.
Just wanna point out that making stuff is different from having stuff. Making your shoe is much different from buying a Nike from the store (and I don't make shoes ;) ).
The craft is an activity, kind of an art by itself. Many find it enjoyable.
It's a luxury journey that most people around the world simply can't afford. The modern world is a marvel because it feeds and clothes them. If they had to pay a market rate to the artisanal shoemaker, they would walk barefoot.
There's nothing "bleak" about building stuff with your hands. Many building trades workers like what they do. And they generally appreciate technology improvements because those tend to make the work safer and less physically demanding.
This sounds nice, but having worked with many artists in the past a lot of them do it because they're good at it, it's enjoyable enough, and it pays their bills so they can eat.
Telling them, "You're now free to make the art you really wanted to make!" doesn't bring much comfort when you're taking away their ability to put food on the table.
Exactly. There are a lot of armchair experts in the forum today who have no clue about the reality of the industry. People do it because they are passionate about it and devote their whole lives to getting good at it; this is just taking food from their mouths.
It takes a lot of time to develop that craft, which won't be available to you if you have to do drudgery to keep a roof over your head. You're arguing for art to be at best a hobby, and full-time pursuit of it to be limited to rich kids.
Also I take issue with your argument about 'none of the jobs we do now' existing through most of history. Farming, construction, fighting, bookkeeping, cooking, transport, security are all jobs that have been around as long as people have lived in settlements.
Sure, you could point to the long history of nomadic hunting and gathering prior to that, but that's like expanding your argument back to the origin of cellular life or forward to the heat death of the universe in order to make your interlocutor's arguments look insignificant on a cosmic scale. It's not a helpful contribution to addressing the real challenges of the present.
There are also loads of artists that do web and graphic design, make videos for product demos, ad campaigns, and so on. It's perhaps not the purest form of art, but it is one way in which artists can apply their craft and still put a roof over their heads. A lot of these AI tools seem squarely aimed at eliminating those positions.
For what it’s worth, I think we’re going to see a slide in quality. Maybe there will be a niche for some. But, I think companies will settle for 70% quality if it means eliminating 100% of a full-time position.
How long a time are we talking about here? It was a lot easier to make a modest but steady living in the arts 30 or even 15 years ago. It's probably easier to have a breakout hit today on YouTube or TikTok and maybe make a lot of money fast, but not to make a living consistently without sweatshopping content or being extremely personally attractive or similar.
Also not that long ago electricity and clean drinking water weren't a thing. The fact that people can make a career as an artist now, and couldn't before, is something I'd consider an advancement! "Nothing wrong with that" is a conclusion that simply doesn't follow from the rest of your post.
Yes, much in the same way that hiring someone to cater a dinner party makes me a great chef.
(edit to give some body to my comment above:
Hosting a great dinner party is hard work and requires coordination between food, decor, seasonality, people attending, etc. It is akin to a director coordinating the parts of a film. So I do think hosting a good dinner party can count as artistic expression.
I don't know the parent comment's intended reading, but I was reacting to the idea that typing a Sora prompt makes someone a good artist. If the parent means instead that AI allows people to coordinate multiple media in a broader expression that was not possible otherwise, then I fully agree.)
So everyone at your dinner party gets to eat "better" food? Unless the point of the party was for you to cook, it's an improved experience.
GenAI is a tool that lets creators of one medium expand to other mediums without much effort. Like having transcripts auto-generated for a visual podcast, just in the other direction. Low budget (or amateur) poems/songs can turn into short videos; or replace generic album art with better quality generic album art.
The draw will be the primary medium, the rest will just be an extra bonus.
It's the same discussion we had long ago when digital cameras came about and image editing became easy and commonplace. Yes, there is a lot of badly edited stuff around now. For example, most meme images on social media are made by putting new captions on old content, and maybe changing a few details about the rest of the image. No, photographers didn't become obsolete. They professionalized.
When cameras were harder to use, you had national geographic taking you all over the world to photograph different locations because only you and a handful of people knew how to take a picture properly.
Now you just hire a local person with a camera to go take the picture you want since it’s much easier to use a camera.
You had people doing photography for ads, now stock photography will do for most brands.
You had people buy high school portraits, I am not sure if people buy those anymore, but a picture of what you looked like in high school is worth a lot less when you can take a selfie every other day.
I'd argue that it allowed people that lacked the creative confidence to create original art, to now have the confidence to make generic art. I don't mean this in a deeply negative way. I just think that people's view of "good art" is so narrow.
AI allows mediocre people to make an endless stream of mediocre, dull, aseptic, sterile content. FTFY.
I don't see it as a negative per se; the thing is, most people won't have the decency to keep all that shit to themselves and, say, share just the best 1% they produce. They will flood their social networks, and the rest of us will have to sift through the crap for our daily dose of internet memes.
My point is that I am an aspiring artist who is waiting tables, and I invest all my spare cash and time in getting better at my craft, in the hope that my craft can eventually support me financially.
Any hope of financial benefit coming from my craft is quickly taken away by DALL-E. This has nothing to do with how much I enjoy my art.
There will always be things humans can do that AI cannot. And if there ever comes a point when that's not the case, there will be no need to distinguish the two.
What does "better" mean in this context? The camera was better at capturing realism than any painter who ever lived. While we still have people who paint in that style, there aren't nearly as many, and art took new shapes and forms.
Most “good” art isn’t just what you see, it’s also the story behind it. Why was it made? What is the story of the artist? What does it make you feel?
AI might allow more people to tell some of those stories they may have lacked the raw skills to tell before. And for those who have the skills, they can make exactly what they envision, without being limited by some of the randomness in the AI. I think there will always be a place for that, and at the top of the market, that’s what people want.
This use case is a direct threat to actors: when AI can create realistic footage with human and non-human subjects, and you add generated speech to this, you have totally replaced hiring actors and killed their employability.
Sorry, that's like claiming that the cinema has killed the theater, or that computer games have killed movies. Or that photorealistic 3D games have killed 2D side-scrollers.
Blockbuster movies depend to a large extent on the pedigree and abilities of their cast. For the big studios, these models are therefore quite useless, apart from bringing dead actors alive again. If publishing material created from living actors without consent isn't illegal already, in a few years it will be.
This might actually save the movie industry and force it to improve the quality of its output. There will be a huge indie scene of movie makers using models, who can only compete via the content of the movies they produce. The realism of the characters won't matter because everyone can have that now. The current big studios will be forced to make very good use of human actors to compete, and become innovative again.
Your analogy doesn't quite make sense. The reality of TV/film production is that most of what we watch is created by big production houses, not indie creators. These companies will do whatever it takes to reduce their costs, the biggest of which is salary for the hundreds of staff they currently employ.
Now with such AI tools, you can write scripts, create artwork, create footage, and record voice-overs and dialogue. All of this means less need for creative labor, which will not only cause huge unemployment in the sector but also lead to protests. It already happened last year in Hollywood, and it's going to get louder and louder unless we put regulations in place to prevent job disruptions.
But those tools can also be used by the very employees that got laid off. They would become part of the indie scene. The film studios will be left with their trademark portfolios, which will be milked for profit. We might see an Avengers movie every month. There will be an absolute glut of such productions, to the point that people might not be interested anymore. Can't tell what happens next. We might lose ourselves in the holodeck, or we might again appreciate media produced with a more human touch.
I like going to theater or opera. Even for famous pieces, the performance will be slightly different and unique every time. Imperfect, but with changing and nevertheless accomplished actors, singers, musicians, and dancers. Many people feel the same and that's why they watch live performances of singers, bands, and DJs.
Most likely the market will be consolidated by existing popular and prescient actors, who will add IP protections to their AI likenesses and benefit from them, especially after a certain age.
> We are giving the enjoyable parts of life to a computer. And we are left with the drudgery.
Yesterday I asked a local LLM to write a Python script to have several multimodal LLMs rank 50,000 images generated by a Stable Diffusion model. I then used those images to train a new checkpoint for the model, and can now repeat the process ad infinitum.
In the olden days of 2020 I would have had to hire 5000 people each working for a day to do the same.
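That kind of ranking loop is simple enough to sketch. A rough, hypothetical version follows — the commenter used local models, and this substitutes a hosted vision API for brevity; the model name, prompt, and paths are illustrative, and a real setup would average scores across several models:

    import base64
    import json
    from pathlib import Path
    from openai import OpenAI  # any vision-capable client would do

    client = OpenAI()

    def score_image(path: Path) -> float:
        # Ask a multimodal model to rate one image on a 0-10 scale.
        b64 = base64.b64encode(path.read_bytes()).decode()
        resp = client.chat.completions.create(
            model="gpt-4o",  # stand-in for the local multimodal LLMs
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "Rate this image's aesthetic quality 0-10. Reply with only the number."},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/png;base64,{b64}"}},
                ],
            }],
        )
        return float(resp.choices[0].message.content.strip())

    # Score every generated image and keep the top 10% as training data
    # for the next fine-tuning round.
    scores = {p: score_image(p) for p in Path("outputs").glob("*.png")}
    keepers = sorted(scores, key=scores.get, reverse=True)[: max(1, len(scores) // 10)]
    Path("keepers.json").write_text(json.dumps([str(p) for p in keepers]))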
> We are giving the enjoyable parts of life to a computer
Are we though? People still do plenty of things out of interest or hobby, despite it being fully automatable?
e.g. blacksmithing or making certain homemade things?
While these are non digital things, why can't we apply the same thing here?
Some people still hand-write assembly out of the novelty and interest of it, despite there being better tools or arguably better ways of writing code.
There is no moat. This will all be commonplace for everyone soon, including with a rich open source community.
OpenAI won't let you do nudity or pop culture, but you can bet your uncle that models better than "Sora" will be doing this in just a few months.
> We are giving the enjoyable parts of life to a computer. And we are left with the drudgery.
No. This means that the tens of thousands of people working in entertainment building other people's visions can now be their own writers, actors, and directors.
This is a collapse of the Hollywood studio system and the beginnings of a Cambrian explosion of individual creators.
Because there will be millions of other people all making AI movies. And you'll be competing for attention with all of them in an attention lottery with seven or eight (nine?) figure odds against you.
The only original creativity will be in creating new formats and new kinds of experiences - which will mostly mean inventing new kinds of AI.
Everything made in an existing format will either be worthless or near as.
Same applies to software dev. Far more quickly than most people expect, it will also apply to AI dev.
Couple of years more, and everyone* will be able to generate these things for themselves, with the only requirement being knowing what to generate next. We might get very bored, and we might decide to appreciate partly or fully human-made things again.
*: except the ones starved for compute resources of course
Will there even be "blockbuster" video games, movies, books, etc? If hours after release, there are hundreds of lookalike clones, will there be "hits" like we know of today? We see this in the App Store today. It is just hard for me to see that part-time product being a big success, when at the first whiff of an interesting idea it will get repackaged, probably into something more effective.
"have scientific journals disappeared" -- ironically, in the AI field most of the action is on arxiv / github / twitter. journals have been obsolete for decades, and the '10s obsoleted conferences too. the only function journals / conferences still serve in the AI field is to stack rank researchers and provide signal for hiring / funding decisions.
Yours is an optimistic take and frankly I do agree with most of it: there isn't an upper bound to economic opportunities as long as everyone gets to use the tools, since the cost/risk to produce something new will significantly decrease, this will boost the diversity of creative industries from which countless gems will be made. However the problem is what if everyone loses their current opportunities before these techs become widely available? How to handle the transition period? A monopoly/oligopoly is not going to care about helping the average person in accomplishing that transition, because it won't make their next quarterly earnings report look pretty.
I don’t think this plays out this way in reality. Look at music streaming. Record labels are still important to making or breaking music careers even in the age where any artist is discoverable, there is no ‘switching cost’ and there is 0 cost of production and distribution (making and shipping CDs).
In a world where attention is scarce, I sadly think big corps and power brokers will still play a large role. Maybe not, though.
> Filmmakers being able to bring their vision to life using generative models is going to create such a huge expansion of the market.
Of what market? Certainly not film production. I have my doubts about whether it will expand the market for films, in the economic sense. The lower the cost of producing and distributing a film, the lower the monetary value people will place on it.
Look how most music artists are no longer able to survive on royalties, while a few massive streaming companies have an astoundingly profitable oligopoly on consumers' music interest. Yes, many pre-streaming publishers were exploitative or unethical, but I'm not convinced it was to a greater degree than the current market leaders. Consider also that the streaming revolution steamrolled many, perhaps most, indie record labels that supported niche genres; some live on but are no longer able to sustain physical output and are reduced to being digital marketing companies.
Now, people will continue to tell stories and entertain others, so technology like this will be good for people with an artistic vision who can't easily access publishers for whatever reason. It will certainly allow people to pursue bold artistic visions that would not otherwise be economically feasible - exotic locations, spectacular special effects, technically complex perspective moves. Those are good things; I worked in the film industry for a long time and have several unproduced scripts that I'd like to apply this technology to, so I'm not rejecting it.
However, more content doesn't necessarily translate into more economic activity; I think it very likely that visual media will be further devalued as a result. People who have spent years or a lifetime developing genuine craft will be told to abandon it in favor of giving suggestions to a computer system, and those who don't will be laughed at or suspected of fakery, because fakery is so widespread these days (Relevant recent example: https://news.ycombinator.com/item?id=39379073). The easier it becomes to make something, the less value the market will assign to it; rational from the abstracted perspective of pure price theory, disastrous in real life.
Increasingly, we seem to be tilting towards a Huxley-esque dystopia of stunning and infinite-feeling virtual worlds to which we can escape on demand, and an increasingly shitty real world marked by the brutal economic logic of total resource and information exploitation. Already a stock rejoinder to complaints about the state of things is that humanity is on paper richer than ever before, to the point that bums have smartphones and anyone can afford an xbox. I have a homeless neighbor who's living in his car, spending his dying years watching YouTube on his phone to fall asleep because he's lonely. Technically this is an expansion of the market, but I don't think it's a good outcome.
Mega-corps exist by dividing work into tasks that can be procedurally performed by minimally skilled laborers, then keeping the delta between the cost of their time and the value of the product. Was it Karl Marx who argued this first?
AI turns skilled labor into cheap labor, supposedly, right? It's a massive enabler for mega-corps. Not a death knell.
I am not sure we’re getting fewer megacorps as technological progress marches on. We probably have bigger companies with broader influence now than 50 or 100 years ago, right?
I seem to be immune to it now. I’ve just accepted that I’m going to feel less and less useful as time goes by, and I should just enjoy whatever I can. Life will probably never be as good as it was for people 30 years older than me, but it’s not something that looks likely to change.
Nothing about the future looks particularly good, other than that medicine is improving. But what’s the point of being alive in such a sanitised, ‘perfect’, instant-dopamine-hits-on-demand kind of world anyway?
Just say to hell with it and bury yourself in an interesting textbook. Learn something that inspires you. It doesn’t matter if ‘AI’ can (or soon will be able to) do it a billion times better than you.
1) My wife trained as a typesetter on a photo-typesetting machine. That was already replacing typesetters working with lead, the people sorting the used lead, the people working with inks, etc. They still needed a paste-up artist and more. Eventually the GUI-based computer arrived, with PageMaker, Quark, InDesign, etc. These days she is super productive, with a massive online icon library available and full printing and distribution capabilities, able to do a job that previously could have involved half a dozen or more people.
Are those people unemployed now? Not really (we are talking 1.5 generations later, so not the same people). Unemployment levels are low, and the workforce is significantly larger, with both men and women working. The share of the population working outside the home has gone up significantly over a few generations, despite all the new enabling productive tech.
What I see is a lot of visually higher quality work being delivered, but often with the same core content. So productivity has increased, but you get a glossy new shiny report in a PDF, instead of a photocopy of a typewritten page. (Yes, I do simplify. But I think you get the gist.)
2) I started as a system admin, then a systems analyst, worked through project manager, etc., until I was leading startups. In the space where I work now, circular economy and food production, there is so much work to do that any AI support we can get is welcome. But as the work is innovative, new, and not done before, the AI tools most often aren't that useful, yet. That may change, but in a society that needs to replace a significant part of its infrastructure and processes to achieve long-term sustainability, I don't worry that AI tools will take my job or any of my colleagues' jobs away. There is plenty to do. I have enough new things in front of me that I could probably keep a whole big venture fund occupied for a long time.
This reminds me of some claims I've heard about domestic cooking and baking. Supposedly recipes got more sophisticated as we developed machinery like blenders and more pre-processed ingredients to make the work quicker, with the result that cooking or baking for guests ultimately took roughly the same amount of time as before. The dishes were just more elaborate.
Same with finance: we could all live an upper-middle-class life, with all the luxury of the time, on a single income with one parent working in the home, if we were willing to live the same lifestyle as the 1950s. But life today is much easier than even then, and we'd rather pay more for that extra luxury than live the spartan lifestyle that would've counted as luxury back then.
I think that's a bold claim. Many things that were cheap then are now unreachably expensive. Many careers that paid well then are gone. The converse is also true; many things that were unreachably expensive then are cheap now, but that doesn't mean that we can easily live such lifestyles if we choose to. Forgoing a cellphone and seatbelts doesn't make it easy to afford a bungalow! Nor is such a lifestyle even legal, in many cases. And you won't find a payphone to call your family. The world moves around us, and it's not a matter of choice to be pulled along.
This is just flat out untrue for a variety of reasons. People need to stop thinking of 1945-1975 as "the norm," it was a world historic anomaly that was a direct response to earlier events- the economy targeted full and fair employment to stop people from drifting to more extremist ideologies because everyone literally just lived through the result of highly unregulated capitalism for example. That was a large impetus behind Keynesianism and Bretton Woods, which was ultimately unsustainable and led to the broader global economy we have now- which itself seems more unsustainable by the day.
- 30% of food produced is gone in losses and waste
- People in middle to high income countries have significant obesity and related problems (some countries have either malnourished people or obese people, and less in between)
- Pollution from our agriculture and aquaculture is killing the ocean near land
- Our intensive agriculture is threatening biodiversity
- We have lost up to 70% of insects in many industrial nations
- Essentially all (90%?) of ocean fish stocks are overfished or at capacity. Even a supposedly rational and environmentally aware nation like Sweden can't stop the overfishing. The cod stock has collapsed and now the herring is going too
- The soils are being destroyed or depleted
- The phosphorus and nitrogen cycles are broken (fossil fuels or resources that are mismanaged)
I could go on. It is well documented.
I work on circular food production, where we really care about putting together highly efficient nutrient loops and making sure they work locally/regionally: a mix of tech (automation, IT, climate control, etc.), agriculture, horticulture, aquaculture, insects, etc. As part of this there are very interesting complementary pieces, with creative ways of getting the nutrients (new food tech) and of dealing with animal disease (new tech), combined with sensors, ML, and plain old common sense, that can make a huge impact, if we just think through the process a bit more and take responsibility for the externalities, which really are starting to bite.
Much is still overhyped in foodtech imho, specifically the stuff which claims silver bullets without proper circularity. Which is detrimental to the real solutions as investors like simple superscalable solutions, and the simple solutions are mostly not sustainable. (There are of course exceptions).
The parent commenter is saying that the moment automation entirely replaced traditional typesetting, people moved on and started using the new technology.
Sure, a part of the population is slow to adapt and therefore at a disadvantage. But the others, like his wife, adapted.
The idea is that this wave of automation will be no different than other times this has happened to us in the past.
The difference could be that A(G)I will automate away the would-have-been new jobs as well, instantly, as it will function as a smarter human that needs no sleep and demands no pay, sort of like a young programmer but without capital owners even having to supply caffeinated beverages.
A lot of decisions are not based on intelligence alone. A lot is about personal beliefs and tastes.
I've never totally understood this binary moment when AGI does "everything" better. How can one even define everything?
Our AI partner could be the most intelligent mathematician or researcher. That's great then we can bounce ideas off of them and they can help us realize our professional / creative ambitions.
Sure if our goal is to maximize profit then maybe we can outsource the decisions to an AI agent.
You can get a computer to create infinite remixes of songs. I haven't seen that replacing music producers doing the same.
The I in AGI doesn't really mean intelligence as in "a mathematician has to be more intelligent than a janitor to do his job"*; it means the productivity equivalent of whatever human consciousness is; that is, being able to have beliefs and tastes as well. And since it will have infinite patience, arguments such as "I prefer to have an actual human musician playing" are also up for persuasion.
When everyone can just press a button and have better music automatically generated, based on their exact preferences inferred from their DNA or an fMRI brain scan, what are your creative ambitions?
I'm obviously not talking about today's limited (public) AI, but far into the future, like in 5 years.
* whether or not that is actually the case is irrelevant
So much of music appreciation is about knowing the artist: knowing, for example, that what they sing about is shaped by their personal history, which allows you to identify with it.
A lot of the appreciation of art is about the process and the intention behind the artwork. The final image or song is just one part.
I find it very difficult to define "better" when we're talking about art.
That isn't the OP's point, I believe. I think the point was: if the more productive means of production are ultra-centralized to a few owners of AI, the question wouldn't be whether to go outside, but whether you can afford not to be permanently outside, if the superstructure of society assigns housing to capital and not to humans.
> if the more productive means of production is ultra-centralized to a few owners of AI
But AI is different from previous waves, like search engines and social networks. You can download a model onto a USB stick. You can run it on a CPU or GPU, even a phone. These models are easy to work with, directly in natural language, easy to fine-tune, fast, cheap, and private under your control. AI is a decentralizing technology that will empower everyone directly; like open source and Linux, it puts users in control.
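For the skeptical, here's roughly what "run it yourself" looks like today; a minimal sketch, assuming the Hugging Face transformers library and a small open-weights model (the model name below is just an example):

    # Minimal local-inference sketch (assumes: pip install transformers torch).
    # Everything runs on your own machine; no cloud service is involved.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="distilgpt2",  # example of a small open-weights model
        device=-1,           # -1 = run on CPU
    )

    result = generator("AI is a decentralizing technology because", max_new_tokens=40)
    print(result[0]["generated_text"])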
And that sand takes a very, very long time with lots of big brains to figure out how to manipulate at the nanometer level in order to give you a "beep boop"
It's not like Intel could decide tomorrow to spin up a fab and immediately make NVIDIA and TSMC irrelevant. They're the next closest thing, given that they make chips and have GPU technology and foundry experience, and it would still be multiple years of effort if they chose that direction.
Your statement is a lot like saying "poker has predictable odds" and yet there is still a vast ocean of poker players.
Anyone can make a cotton gin. Industrialization of an industry basically centralizes its profits on a relatively small number of winners who had some lead-time advantage on a few important factors, as it stops being worthwhile for the vast majority of the participants from the days when the industry required more of the population.
One can't really enjoy life much without the financial means to survive. This technology promises to wipe out hundreds of thousands of jobs in media production: videographers, actors, animators, designers, camera operators in TV and movie production are all one click away from losing their jobs.
I thought the point of TV was to sit back and be entertained, usually through some form of storytelling. Personally, I don't want any part in the creation. If anything, custom content would be annoying, because I'd lose the only social aspect of TV (discussing it with others).
"I’ve just accepted that I’m going to feel less and less useful as time goes by"
It's probably the same feeling farmers had in the beginning of the 20th century when they started seeing industrialized farming technologies (tractors, etc). Sure, farming tech eliminated tons of farming jobs, but they have been replaced by other types of jobs in the cities.
It's the same thing with AI. Some will lose their jobs, but only to find different types of jobs that AI can't do.
Sorry, but comparing this to previous technology seems totally short-sighted to me (and it’s not as though you’re the first to do so). If (if) we end up with truly general AI (and at the moment we seem to be close in some ways and still very far off in others), then that will be fundamentally different from any technology that has come before.
> jobs that AI can't do.
Sure, by definition, you’ve described the set of jobs that won’t be replaced by AI. But naming a few would be a lot more useful of a comment. It’s not impossible to imagine that that set might shrink to being pretty much empty within the next ten years.
> It’s not impossible to imagine that that set might shrink to being pretty much empty within the next ten years.
No but it’s also not impossible to imagine the opposite. AI beat humans at chess decades ago but there are more humans generating income from chess today than there were before Deep Blue.
No one pays anyone to play chess because it’s useful.
Chess players get paid because it’s entertaining for others to watch.
So your argument only shows that we can expect work as a form of entertainment to survive. Outside of YouTube, where programmers and musicians and such can make a living by streaming their work live, this is a minuscule minority.
The strongest interpretation of what you’re saying seems to be that we’ll end up in a world where everything (science, engineering, writing, design) is a sport and none of it really matters because ultimately it’s ‘just a game’. Maybe so… but is that really something to look forward to?
They get paid because the people who can't play chess professionally watch it as a mental escape from their drudgery jobs, because it reminds them of their youth, when they could still dream about becoming a great chess player; and then the marketing displayed during the chess tournament tricks them into preferring to spend the money they make from the drudgery on the advertiser's product.
Now upgrade AI to do every job better than humans so that there are no drudgery jobs. What money are they going to spend?
Not too long ago, people would come and visit the first family in the village who had installed running water, because it was a new and exciting thing to see. And yet people don't wake up every day excited to see water coming from their kitchen tap.
Think more broadly than that single example. Perhaps humans will always be interested in economic activity that involves interacting with other humans, regardless of what the robots can do.
My intuition tells me humans will always have needs that AI can't fulfill. If AI does more and more jobs cheaper, faster, and better than humans, then the prices of these services and goods are going to drop, and that means people will have more disposable income to spend on other services and goods that are more expensive because AI can't produce those (yet).
Imagine a breakthrough not only in AI but also robotics, allowing restaurants to replace the entire staff (chefs, cooks, waiters, etc) with AI-powered robots. Then I believe that higher-end restaurants will STILL be employing humans, as it will be perceived as more expensive, more sophisticated, therefore worth a premium price. What if robot cooks cook better and faster than human cooks? Then higher-end restaurants will probably have human cooks supervising robot cooks to correct their occasional errors, thereby still providing a service superior to cheaper restaurants using robot cooks only.
I agree, but I also think this discussion needs to go deeper into its assumptions. They can't really hold in a world with AGI. Can anyone acquire/own AGI? Why? Why not? Will anyone pay anyone for anything? Will capital, materials, and real estate be the only things with steep price tags? What would a computer cost if all work were done by AI?
>It's probably the same feeling farmers had in the beginning of the 20th century
Not remotely comparable. Farming is a backbreaking job; many were happy to see it go away. This is taking over the creative functions. Turns out what humanity is best at is menial labor?
Well, replacing novel creative functions with derivative creative functions. That's the big change I see here; similar to the difference between digitally editing an image vs. applying a stock sepia filter to it. Yes, we can use a model to regurgitate a mish mash of the data it was trained on, and that regurgitation might be novel in that nothing like that has been regurgitated before, but it will still be a regurgitation of pre-existing art. To some degree humans do this too, but the constraints are infinitely different.
Humanity will not be best at anything. Even menial labor will be automated.
So the downside is we have lives devoid of meaning. The upside is we live in a scarcity-free paradise where all diseases have been cured by superhuman AI and we can all live doing whatever we want.
> but they have been replaced by other types of jobs in the cities.
But when one is 30+ years old, or even 40+ years old, it's hard to completely switch careers, especially when you're also dealing with the fact that it's not because you were bad at your job. Rather, a machine was made to replace you and you simply can't compete with a machine.
It's evolution, of course, but it is a stressful process.
I see this "just adapt" response a lot and it misses the point. The goal of research like this is to create a machine that can do any job better than humans.
That's been the prediction with many technological updates, but here we are. This setup works just fine for the small group of fantastically wealthy and powerful people that dictate society's requirements for the rest of us.
I can't imagine anything changing our culture's insistence that personal responsibility in employment means zero responsibility for employers, policy makers, or society at large. That is, short of a large scale armed rebellion, or maybe mass unionization.
With all of the great AI-driven public opinion influencing tools? I can't imagine they'd need to TBH. To be clear, I think the likelihood of an armed rebellion is zero, and a successful one would be less than zero. While mass unionization may be more likely, as soon as it starts to significantly impact the top's bottom line across the board, we'd see a bunch of laws that cripple unions.
History is full of examples of people with power deciding that they like resources but don't need the people who live on top of them, and solving this problem by going on a killing spree.
Bear in mind that a substantial portion of people (perhaps 30%) don't feel satisfied unless they see someone else worse off. We are not an inherently egalitarian species.
"People" find different jobs but individuals don't. Many people displaced by technology don't recover even despite retraining programs and go work in service industry or go into early retirement. The new jobs go to a younger generation.
I've started reading again, because reddit/instagram/etc. has become kind of boring for me? Like, I still go on them to get an instant dopamine hit from time to time, but like you said burying yourself in a textbook just feels so much more rewarding.
I've abandoned all online content sources except HN, Substack, and YouTube. The latter two are aggressively filtered and still feel like they're getting less interesting over time. HN isn't the best habit, either, but it's good to have at least one source of news.
Maybe someone needs to start a small group of people who specifically want to do this — seek refuge from the chaotic and increasingly worrying world (in particular the threat of replacement by extremely general automated systems) by immersing themselves in learning, and sharing the results with others.
I’m sure such groups already exist, but maybe not specifically with this goal in mind.
Learning for its own sake really is the answer to lasting happiness… for some of us, anyway.
> seek refuge from the chaotic and increasingly worrying world (in particular the threat of replacement by extremely general automated systems) by immersing themselves in learning, and sharing the results with others.
I think his point is that people have felt like the world is going to shit for a very long time; it's just that with the benefit of hindsight we can see that, in the past, everything worked out, whereas we can't see the future, so our present is troubling.
But none of these feelings are new, just different problems manifesting the same way.
In any group there are people who are more talented, more persuasive, and/or have more initiative than others. These people will naturally become the group's leaders. This can only be avoided in groups which don't have to make decisions or conduct activities.
Hey, Euclid's ideas from 2000+ years ago are still going strong.
I doubt much of what we know today will turn out to be wrong. Maybe our abstractions will turn out to have been naive or suboptimal, but at least they're demonstrably predictive. They're not just quackery or mysticism.
Well… in subjects like mathematics they kind of do, don’t they? There’s not much room for opinion on what’s true and what isn’t. Of course, how something is done or the language used to describe it is always up for debate.
You did say ‘wrong’, though, not ‘considered wrong’.
There are no negative quanta and there are no negative qualities. It would be hilarious to suggest there could be products of the two.
You have 3 baskets with 5 apples each; you remove 7 apples from each basket and remove 5 baskets, and you have -2 baskets with -2 apples each, therefore you have 4 apples left, all without the involvement of trees. Like Jesus!
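For what it's worth, the sign arithmetic the joke leans on really does come out to +4; a toy check, nothing more:

    # (-2 baskets) x (-2 apples per basket) = +4 apples
    baskets = 3 - 5             # -2
    apples_per_basket = 5 - 7   # -2
    print(baskets * apples_per_basket)  # prints 4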
Not really. Universities barely even pretend to be ‘churches of learning’ — at least anymore. Going to university, for the vast majority of students these days, is more an exercise in CV-building than self-development and learning.
Such groups are by definition reclusive, hard to find on social media, and might be a lot more fringe or "weird" than you'd prefer. For a while, subreddits were a bit like this.
I don't think it's too far fetched to hypothesize that the next major global conflict will be between accelerators (e/acc) and decelerators. I see a parallel with political/economic ideologies like capitalism and communism. One of them will eventually prevail (for most of the world) but it won't be clear which until it happens. Scary but also exciting times ahead!
Is this a joke? Go outside. Go hiking. Make a garden. Visit Yosemite. Take up bouldering. Learn to surf. Cycle. Go camping. There's a world of living and massive communities but around real life. Explore what your body and mind can do together. Find kinship because it's out there in spades for people not obsessed with the automation of machined content.
> Life will probably never be as good as it was for people 30 years older than me
> Nothing about the future looks particularly good, other than that medicine is improving.
How do you reconcile your thoughts with what the CEOs of these AI companies keep telling us? I.e. "the present is the most amazing time to be alive", and "the future will be unimaginably better". I'm paraphrasing, but it's the gist of what Sam Altman recently said at the World Government Summit[1].
Are these people visionaries of some idealistic future that these technologies will bring us, or are they blinded by their own greed and power and driving humanity towards a future they can control? Something else?
FWIW I share your thoughts and feelings, but at the same time have a pinch of cautious optimism that things might indeed be better overall. Sure, bad actors that use technology for malicious purposes will continue to exist, but there is potential for this technology to open new advancements in all areas of science, which could improve all our lives in ways we can't imagine yet.
I guess I'm more excited about the possibilities and seeing how all this unfolds than pessimistic, although that is still a strong feeling.
> How do you reconcile your thoughts with what the CEOs of these AI companies keep telling us? I.e. "the present is the most amazing time to be alive", and "the future will be unimaginably better". I'm paraphrasing, but it's the gist of what Sam Altman recently said at the World Government Summit[1].
Three ways:
* It's the job of CEOs to advocate the benefits of what they're doing.
* Those things might be true, for them.
* Those things might be true, from a global perspective, even if there are some people who are worse off. White-collar workers might just be those people worse off.
If you think that's bad, try being 18 :) this field may not exist (at least in its present form) by the time I'm out of uni (I'm planning to hedge my bets by studying physics), and the world seems to be getting less stable by the minute. There seems to be no sense of urgency or even medium-term thinking in stopping Putin, and Article 5 appears to be becoming less sacrosanct by the minute. Society is increasingly divided, with absolutely no attempt to find common ground (particularly evident in my demographic), and the majority of my generation has a minuscule (and shrinking) attention span thanks to their direct stream of Chinese propaganda. And, of course, the climate-shaped elephant in the room.
I'm just trying to not let it get in the way of appreciating the world. I'm planning to travel to mainland Europe sometime next year (gap year). SpaceX has reignited spaceflight, and there's so much cool stuff going on in that space. Science marches on, with a steady stream of interesting discoveries.
And programming is great, for now. It feels slightly strange spending a week writing a project that may be finished with a single prompt in a few years' time, but it's enjoyable.
Maybe I'm overreacting? I've grown up in a pretty calm period, with the west in a clearly dominant position. Maybe this is, paradoxically, a return to normality?
Cheer up. It's not real. Generative AI is going to force us to confront what makes us human and the real world real, and learn to love it all over again. Sure, a lot of people will be lost in digital realms. Some might even like it. But I think that many will embrace the messy, imperfect, poignant realm we live in.
I’m the complete opposite, I wish I was being born 20 years in future. I am kinda terrified of being 80 when they come out with some technique for heavily slowing down aging and our generation just has to sigh and accept we just missed the cutoff.
I'll just say: I have Type 1 diabetes, and in my lifetime, we have invented
- fast acting analog insulins that are metabolized in 2-3 hours instead of 6-7
- insulin pumps that automatically dose exactly the right proportion of insulin
- continuous glucose monitoring systems that let you see your BG update in real time (before, it was finger sticks 4-5 times a day; before that, urine test strips where you pee on a stick to get a reading delayed by 6 hours (!))
- automated dosing algorithms that can automatically correct BG to bring it into range
In aggregate, these amount to what is closer than not to a functional cure for type 1 diabetes. 100 years ago, this was a fatal condition.
You are partially correct. But notice that diabetes, both type 1 and type 2, has dramatically increased as a direct result of bad advice and environment. A little like giving a deaf person a hearing aid while not addressing factors, like loud noise, that may lead to hearing loss.
>Medical understanding is not getting worse, unless I’m severely mistaken.
You are mistaken. To realize that, you will have to look back several decades and read the literature of those times, what is left of it. Note that I'm talking about chronic illnesses (diabetes, cancer, etc.), not acute ones like an infection. The medical practitioners of yesteryear did not have the fancy diagnostic tools that we have today, but several of them appear to me to have been sharp observers.
You're exactly right, but most people just believe the headlines about cancer cures and "individualized medicine" that pop up every week and don't realize that literally none of them produce anything that helps real life patients. Medicine is not getting better - it's getting more expensive and less efficient.
I dunno, I can casually get an MRI to check the status of slime in my nose these days. It may not be strictly ‘better’ but the availability certainly goes up.
A majority of what you wrote is objectively false FUD. The only thing that I found accurate is:
> it's getting more expensive and less efficient
There have been a ridiculous number of medical advances in the last few years, advances that are actively improving and saving lives as I write this. Remember that time we had a pandemic, and quickly designed and produced a massive number of vaccines? Saved millions of lives, kept hundreds of millions from being bedridden for weeks? The medical technology to design those vaccines, and to produce them at that speed and scale, didn't exist 20 years ago. Cancer treatments, which you specifically mentioned, are entirely better than they were 10 years ago.
The actual issue, which is the only worthwhile thing you wrote about, is cost and availability.
You are simply ignoring what I actually said. My criticism was directed at specific fields: name one cancer treatment or "individualized medicine" approach that has been proven to save lives or increase quality of life in the last 3-5 years. I'll wait.
The vaccines were not the result of medicine getting "better" - they just happened to have a solution for the right thing at the right time, which is fortunate (and we're lucky that it worked, because there was no guarantee of that beforehand) but if the pandemic hadn't happened, what advances would we be discussing? What advances are actually making medicine better aside from once-in-a-hundred-year worldwide emergencies?
For what it's worth, there have been a lot of situations like this in the past. Maybe not as fast as this, but tech has displaced jobs many times, like with the cotton gin and computers, and more jobs have come about from those (like probably your job). Now, you can say that this is different, but do we really have any data to back that up aside from the speed of development?
As for social safety nets: if this affects people as heavily as you think (on an unprecedented, never-before-seen level), the US will almost certainly put _something_ into place and add some heavy taxes on something like this. If tens of millions of Americans are removed from the workforce and can't find other work because of this, they'll form a really strong voting bloc.
Also consider that things are never perfect. We've had wars around the world for a notable amount of time. Even the US has been in places we shouldn't have been for a serious chunk of the last century, but things have worked out. We have a ton of news and access now, so we're just more aware of these things.
Hopefully that perspective helps a bit. HN and social media can have "doomer" tones quite a bit. Hopefully some perspective can help show that this may not be as large a change as we think.
Or maybe I'm an idiot, as some child comments may point out shortly.
By definition, we don't have data for events we haven't seen before. So instead I reason as well as I can:
Consider the set of all jobs a human being could do. Consider the set of all jobs an AI system could perform as well as a human being but more cheaply. Is the AI set growing, and if so, how quickly?
Prior technology is generally narrow and dumb: I cannot tell my cotton gin to go plant cotton for me, nor can I ask it to fix itself when it breaks. Therefore I take on a strategic role in using and managing my cotton gin. The promise of AI systems is that they can be general and intelligent. If they can run themselves, then why do I need a job telling them what to do?
Isn't this making the assumption that the stuff that needs to get done is fixed size? New technologies also create entirely new categories of jobs.
"Computer" used to be a profession, where people would sit and do multiplication tables and arithmetic all day [1]. Then computing machines came along and put all those people out of work, but it also created entire new categories of jobs. We got software engineers, computer engineers, administrators, tons of sub-categories for all of those, and probably dozens more categories than I can think of.
I think that there's a very high likelihood with the current jobs that humans do better than computers, most will be replaced by cheaper AI labor. However, I don't see why we should assume that set of things that humans do better than computers is static.
I'm trying to point to the set of all jobs a human being could do, which includes future jobs enabled by future technology.
This is not as nebulous of a set as it sounds because it has real human boundaries: there are limits to how fast we can learn, think, communicate, move, etc. and there are limits to how consistently we can perform because of fatigue, boredom, distraction, biological needs like food or sleep, etc. The future is uncertain, but I don't see why an AI system couldn't push past these boundaries.
Maybe if AI could do all the jobs humans could do, we'd set up some system where the AI works and we don't, since we'd tax it, or at least some part of the created goods and services would flow to everyone. Anything AI "creates" is worthless unless it's consumed, and AI, being machines/software, won't inherently want to consume anything (like burgers, for example).
I also struggle to think about all this, but imagine you could flip a switch so that everything produced and consumed in the economy could be made in half the time: is that a good or a bad thing? If we keep flipping that switch, approaching a point where everything is produced with almost no human effort, does it become bad all of a sudden?
Somehow we'd need to distribute all this production. I'm not sure how it would work out, but just going from the effort we spend now to half or 25% of it is probably an improvement; at least I'd take that.
We are close to 1.5 degrees of global warming, and the world is busier with war than with making a unified effort to change things. That is what is depressing to me, not that AI can make somewhat convincing background-scenery movies (as standalone videos I do not find them convincing; impressive all in all, sure, but with too many errors).
I’m more worried that if AI really works out, businesses will end up consuming as much energy as they possibly can using it, because using it more than the competition will provide another edge. It’s not clear how we are supposed to reduce energy consumption. Is there a boundary of diminishing returns that would impose a limit?
I think the big visionaries behind the main AI developers are all hoping to achieve AGI, and hoping that it will be able to fix everything, outweighing the vast short-term energy usage of trying to create it.
Not worried; I trust my taste. I still haven't seen anything made by AI that moved me. I'm buying physical books written before AI was a thing, backing up music and film, visiting concerts and museums. The information and experience in my head will become rarer and more valuable compared to the AI slop that will soon permeate everything. Oh, your model is trained on the billion most-read online texts in the English language? Cute. I'm pulling inspiration from places that aren't captured by any model.
Most of my programming job is tightly coupled with the business processes and logistics of the company I work for, AI will not replace me there.
Also, I'm not convinced this is sustainable. I suspect this will be like CGI, where the first Iron Man film looked phenomenal, but huge demand plus the drive to make it profitable drove the quality down to just-above-barely-acceptable levels, like the CGI in current Marvel blockbusters.
Yes. These systems are working on a 3D problem in a 2D world. They have a hard time with situations involving occlusion. A newer generation of systems will probably deduce 3D models from 2D images, build up a model space of 3D models, generate 3D models, and then paint and animate them. That's how computer-generated animation is done today, with humans driving. Most of those steps have already been automated to some degree.
Early attempts to do animation by morphing sort of worked, and were prone to some of the same problems that 2D generative AI systems have. The intermediate frames between the starting and ending positions did not obey physical constraints.
This is a good problem to work on, because it leads to a more effective understanding of the real world.
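To make the in-betweening problem concrete, here's a toy sketch of my own (an illustration, not how any of these systems actually work): linearly interpolating between two keyframes of a thrown ball misses the arc that physics demands in between.

    # Toy illustration: linear in-betweening vs. the ballistic arc physics demands.
    # A ball leaves the ground at t=0 and lands at t=1, so both keyframes have height 0.

    def lerp_height(t):
        # naive in-betweening: a straight line between the two keyframes (0 -> 0)
        return 0.0

    def ballistic_height(t, v0=4.9, g=9.8):
        # what actually happens in between: h(t) = v0*t - g*t^2/2
        return v0 * t - 0.5 * g * t * t

    for i in range(5):
        t = i / 4
        print(f"t={t:.2f}  lerp={lerp_height(t):.2f}m  physics={ballistic_height(t):.2f}m")
    # At t=0.5 the interpolated frame is about 1.2 m too low: exactly the kind
    # of physically impossible intermediate frame that morphing produced.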
How deep is the pool? Just because you have descended to depths that others were skeptical of doesn't mean there isn't a floor. Besides, these videos are less janky but still obviously fake, and for sure they're cherry-picked for maximum effect.
Well, the videos in the link gave me the same kind of response as literal Latin American gore videos, so lines 3 to 5 still apply, and lines 1 to 2 still apply to ChatGPT with GPT-4 Turbo, so... I don't know what to make of this. Maybe people like gore videos. Or something.
Not to add to the doomerism, but I often wonder how much AI-generated content I've consumed without realizing it, especially from before generative AI became mainstream.
I don't get this angle at all. To me that's like "organic" food labels. What do I care if my content is "AI"-made? When I watch a CGI-animated movie there isn't a little artisan sitting in the video camera like in a Terry Pratchett novel; it's been all algorithms for like 30 years anyway.
When I use Unity I write ten lines of code and the tool generates probably 50k. Ever looked into the folder of a modern frontend project after typing one command into a terminal? I've been 99% dependent on code generation for ages.
Does it matter to you whether you're interacting with a human on some level when watching a show or movie, specifically on an artistry level?
Maybe some movie you've watched has been spun up by a Sora-like platform based on a prompt that itself was AI-generated from a market research report. Stephen King said that horror is the feeling of walking into your house and finding that all of your furniture has been replaced by identical copies - finding out that all of the media everybody consumes has actually been generated by non-human entities would give me the same feeling
>Does it matter to you whether you're interacting with a human on some level when watching a show or movie, specifically on an artistry level?
Yes it matters to me a great deal. But there's a reason Stephen King made that observation a long time ago. All the actors in a modern Marvel movie look like they've been grown in some petri-dish in a Hollywood basement and all the lines sound like they come from LLMs for the last fifteen years. There's been nothing recognizably human in mass media for decades. 90% of modern movies are asexual Ken doll like actors jumping around in front of green screens to the demands of market research reports already.
I'm not saying the scenario isn't scary; I'm saying we've been in that hellscape for ages, and the particular implementation details of the technologies used to get us there ("AI" in this case) don't interest me that much. And in the same vein, an authentic artist can surely make something human with AI tools.
You say that, but most things that are commercially produced aren't made by individuals but via collaborative processes. The AI won't even get credit; they'll just use it to dilute everyone else's contribution.
I felt depressed after seeing this, so I had a long hug with my partner, and remembered the serenity prayer:
"God grant me the serenity to accept the things I cannot change, Courage to change the things I can, and Wisdom to know the difference."
If AI dystopia is coming, at least it's not here quite yet, so I'll try to enjoy my life today.
The writing has been on the wall for that for a while, though...
Every large animation studio has continually been looking for ways to decrease the number of artists required to produce a film, since the beginning of the field.
I don't know who in the upper parts of the various guilds in Hollywood saw this demo'd last summer, but they really really took notice. Those strikes went on for a long time, and it seems that holding out and getting the clause in to exclude this kinda tech was a brilliant bit of foresight. Holy heck, in 5 years, maybe just a year, this kinda tech is going to take nearly all their jobs.
The only reason this is possible is because of the content those people created. This literally doesn’t exist without them. Not sure what you’re trying to say….
The thing that fills me with dread watching these videos is not (much) the thought of how many jobs it might make useless. It's the thought that every single pixel, every movement, is fake. There is literally not a speck of truth in these videos; there is nothing one can learn about the real world. Yes, they're often "right", but any detail can be wrong at any moment. Just like ChatGPT hallucinating, but in a much deeper way: we know that language can be used to lie or just make things up, but a realistic video hits differently. For example, the video of the crested pigeon, a bird I hadn't seen before, is beautiful, and yet it can be wrong in an infinity of details. Actually, I don't even know if such a bird exists.
Best way to deal with the sense of doom imo is to actually use it. You'll find how dumb it really is by itself, and how much of your own judgment/help/editing is still necessary to get anything usable. It might look like magic from these manicured press releases, but once you get your hands on it, it quickly becomes just another tool in your toolbox that, at best, helps you do the work you were doing anyway, more quickly.
I work with these models professionally (well, I'm a web dev working alongside people manipulating these models). When you give one a prompt and it spits out a pretty image, remember that the range of acceptable outputs is very large in that context. It demos very well, but it's not useful outside of stock image/stock video use cases. What artists and engineers actually do is work under a rigorous set of constraints. Getting these models to do a very specific thing, correctly adhere to those constraints, and still maintain photorealism (or whatever style you need) is very much an unsolved problem. In that case the range of valid outputs is relatively tiny.
The more I play with AI, the more I realise that the "I" part of AI is just clever marketing. People who are freaking out about AI should just play around with it; you will soon realise how fundamentally dumb it is, and maybe relax about it.
AI has no spark, no drive, no ambition, no initiative, no theory of mind, and it's not clear to me that it will ever have these things. Right now, it's just a hammer that can build 100 houses a second, but who needs 100 slightly wonky houses?
Um. A hammer that can build 100 houses a second would be incredibly valuable, both solving and causing some very important problems. So good analogy from my perspective I suppose, but I don't think it supports your conclusion?
AI and AGI are practically two different concepts, and most of the industry and the mainstream media are doing a poor job of distinguishing between them.
Also, 100 slightly wonky houses will sell like hot cakes if each one costs less than 1/100th of a not-slightly-wonky house. People will buy 100 of them instead of 1 and just live in a different one every day/hour so they always notice the novel parts instead of the wonky parts. We've had mass manufacturing for centuries and they always prevail when the trade-offs are acceptable.
While I do not think it is impossible to get there, I totally agree that this is a key step that current AI is missing. Auto-GPT seemed to be the big thing that could lay out, plan, execute, and reiterate complex tasks, but ultimately it wasn't able to do anything like that. Kind of ironic that it is reinforcement learning that the models seem to be so bad at.
Sorry if my comment didn’t give enough context. I’m not the OP, so I’m not asking any questions.
I was interpreting the parent comment as saying the spark of consciousness only needed a cost function.
Personally, I disagree that our current neural nets are accurate representations of what goes on in the human brain. We don’t have an agreed upon theory of consciousness, yet ML businesses spread the idea that we have solved the mind and that current LLMs are accurate incarnations of it.
More than AI replacing current human jobs, I worry about what we will lose if we stop wondering about the universe between our ears, thinking we already know everything there is to know.
I can see all of my plans for world domination coming together right in front of my eyes. A few years ago I was absolutely certain I’d die without achieving my dream of becoming God Emperor of a united Planet Earth.
Why should we always take the pessimistic viewpoint? Think of all the beautiful things that can be built with something like that. All the tutorials that could be created for any given subject. All the memories that could be relived. Upload a photo with your grandparents, give it context, and see them laughing and playing with you as a toddler. Feed it your favorite book and let it make a movie out of it. I mean, fuck me, the possibilities are endless. I don’t feel depressed. I feel blessed to be able to live in an era when all these marvelous things materialize. This is the stuff we read in science fiction decades ago.
> Upload a photo with your grandparents, give it context, and see them laughing and playing with you as a toddler.
Exactly. People just aren't seeing this. You don't even have to limit the fake memories to real people. Don't have a girlfriend? Generate videos and photographs of you and your dream girl traveling the world together, sharing intimate moments, starting a family. The possibilities are so exciting. I think the people who hate this idea are people who already have it all. They're not like me and you.
Idk about you, but I would not consider AI-generated memories of my grandparents even remotely close to an authentic experience. One of my grandparents passed before I was born, so any synthetic depictions of us would be fake. That frankly sounds like a post-apocalyptic experience, if not worse.
It's literally nothing. Generative images haven't really gotten better at the things people care about, like getting specific details right and matching exact descriptions, and avoiding uncanny animals and humans. There's no reason to think video will be any different. No reason to panic - just take it for what it is: something funny to amuse yourself with for a few hours.
I think you can create an alternate reality with these tools in ways we haven't even considered, ways that can alter one's own sense of self.
We have seen this at a small scale with social media and its effect on one's self-esteem.
We will see a new set of problems that run much deeper: videos and images that make you believe a false reality, and reliance on GPT generating false knowledge.
False-reality problems have started popping up everywhere. It is going to get much deeper. I think we are in for a really crazy trip.
It's worth considering that throughout history there have been people who have felt this way. That suggests this perception is a natural human tendency, and it does not have a good track record of turning out to be correct.
Generating the wealth required for something like UBI takes large-scale, technology-driven deflation in the cost of goods and services (in real terms).
Advances like this are necessary steps to get us there.
People will be displaced in the short term, as they have been by every other large-scale advance: cars, the assembly line, and so on. Better to focus on progress, and on helping those most affected along the way.
AI is going to be massively deflationary. How useful is UBI when the cost of goods and services approaches zero due to automation?
With that said, I can imagine the federal reserve will then helicopter in money to everyone in order to reach its 2% inflation target, which kinda sounds like UBI.
Right! I keep saying that we at least have to kick off the process. Not even the legislative process, but convincing the public that we'll need it eventually (or alternatively a whole different system worldwide, but that will be even harder). It will take a long time either way.
Beware of UBI, if only because there is no way the puritanical members of our society will allow it, and if it does get enacted it will have negative ramifications, rendering it more of an economic one-way trap than a safety net. We're simply too quick to other each other, and when those budgeting the entire economy look at the UBI population, its funding will be cut just like they cut education and social services today. I'm afraid of UBI because I don't trust its enactment to be fair, honest, or worth accepting.
And we have been told that with innovation and disruption, a new breed of jobs and skill sets gets created. But we don't know (or are very bad at predicting) what those will be, especially now that hundreds of millions of people worldwide are linked to these economies (film, writing, gaming).
Many people (including myself) have bought into the narrative that history will repeat here and things will be better eventually, but not into how much has to break first; and that narrative gets used as a hammer by OpenAI, and probably by every innovator who has disrupted.
They advertised "Safety" but no "Economic Impact" analysis because the latter is less scary and requires difficult predictive work, the former is just narrow legalese defined by 80-year-old congressman they have to abide by to "release" v1.0. There is at-least a Congressional Budget office(CBO) where the 80-year-olds work, flawed as it maybe...
OpenAI is one major privacy/compliance scandal away from losing that power. I believe it's inevitable, and MS 'will' throw them under the bus when that happens.
Do you feel the doom in relation to yourself or to the future of humanity? If it's the first, I can't think of anything better than having a money safety net for 6-12 months and keeping a flexible mind. You could also learn some physical skills, like electrical work, just in case, if the doom feeling is that bad. If you feel doom for humanity's future, I don't want to be mean, but you shouldn't worry about things you can't control; try to spend more time in nature and with people who spend time close to nature.
As for competition: these are the same thoughts people had when thinking about the Roman (or any) empire, how it could break, how others could compete. In the end everything ends; giants like IBM are just shadows of their past success, some say Google is the next IBM, and probably OpenAI will be the next IBM-ed Google...
I just don’t find much value in the things they are generating, so I don’t see that as a problem. If there is anything positive about this thing, it is that it reminds us how boring and predictable the daily lives of normal people are.
My fear is the alternative reality that these tools could provide. Given the power and output of the tooling, I could see a future where a society's "normal" is strategically changed.
For example, many younger generations aren't getting a license at 16. This is for a variety of reasons: you connect with friends online, malls cost money, less walkable spaces, less third places.
If I'm a company that makes money based off of subscription services to my tools, wouldn't it be in my best interest to influence each coming generation?
Making friends and interacting with people is hard, but with our tooling you can find or create the exact friend you want and need.
We can still remember that life is beautiful, but what's to stop them from making people think that the life made by AI is the most beautiful?
And yeah, I've heard this argument before with video games, escapism, etc. I'm talking more about how easy it is to escape now, and how easy it'd be to spread the idea that escapism is better than what is around you.
One thing to remember is that change never stops and we're certainly not in any perfect society right now where we'd want change to stop at. We've seen huge magnitudes of societal change over and over throughout history.
For the most part, the idea of change is rarely inherently bad (even though, IMO, it's natural to inherently resist it) -- and humans adapt quickly to the parts that have negative impacts.
Humans are one of, if not the most, resilient race on the planet. Younger generations not getting licenses, sticking to themselves more, escaping in different ways, etc are all "different" than what we're used to, but to that younger generation it's just a new normal for them.
One day they'll be posting on HN2, wondering whether the crazy technological or societal changes about to come out will mean the downfall of their children (or children's children), and the answer will still be the same: no, but what's "normal" for humankind will continue to change.
> Humans are one of, if not the most, resilient race on the planet. Younger generations not getting licenses, sticking to themselves more, escaping in different ways, etc are all "different" than what we're used to...
As long as they keep having unprotected sex with each other.
In Europe there's no need. I got a licence over two decades ago and have never needed to drive. Shops within walking distance, public transport anywhere in the country, convenient deliveries, walkable and cyclable cities.
Meanwhile other places have no freedom from cars, locked into expensive car financing, unable to access basic amenities without a car, and motorists have normalised killing millions of people a year.
This is the same as slacklining over a ravine with no harness. Are the views epic? Yes. Does the adrenaline rush feel good? Yes. Are the consequences irreversible if you happen to mess up? Probably yes. The last point is why there's so much more doomerism compared to OpenAI's previous products. We don't have that harness and we don't know if we'll ever have it.
It's an analogy.. perhaps a bit hyperbolic, but it's within the acceptable window IMO.
When Facebook came out very few considered it an existential risk. Turns out, it has immense power over elections. Elections have consequences for the well being of billions of people on the planet. Not to mention it might negatively impact the mental health of its users (a large chunk of the human population).
More climate change, war, microplastics in our bodies, and now extreme joblessness?
If I woke up and saw a headline that said OpenAI had developed an AI which told us how to sequester huge amounts of CO2, then I'd be excited and agree.
Exactly. I'm sick of people advocating changes as "progress" until we get some fundamental baseline sampling of humanity's well-being. When the answers to "are you depressed?", "do you contemplate suicide?", and "are you exhausted?" have been going up for 10 years around the globe, people saying this is "progress" will look like lunatics, and maybe we'll have a better conversation about what progress actually is.
The e/acc camp will tell you AGI can solve all of those, which is why AI research needs to move as fast as possible. What they don't tell you is only an aligned AGI can solve it in a way beneficial for humans.
We had a half-assed lockdown for a few months where most people just kind of stayed indoors and saw noticeable environmental improvements world-wide. An unaligned AGI can easily conclude the best way to fix these problems is to un-exist all humans.
There has never in the history of humanity been anything "aligned" in the sense that AI doomers use that word. Yet humanity has had a clear progression towards better, safer, and more just societies over time.
You say as you comfortably type on a thinking machine while indoors sheltered from the elements, presumably without any concern for war or famine or marauding gangs of lawless raiders.
Climate change is looking pretty existential to me. Might be comfortable now, but where I am we're having extreme snowfall shortages which are going to affect our water supplies and ability to farm. God knows what it's going to look like at this rate.
So, practically, massive destruction of the biosphere so people can sit there smug on their computers in the air-con.
Anyway you reinforced my point. Progress is a good idea, I'm not sure we all have the same ideas about what good progress looks like. Lifting the rest of the developing world by selling them arms and fossil fuels so I can sit on my computer in my room reading smug comments is probably not good progress.
Like you said, it's a feeling. Once you've identified it, just remember you have many many buttons that can be pushed to generate feelings. It's just a program installed long long long ago. Visualize that, breathe and just laugh at that poor bash program.
There is some usefulness to those feelings - this announcement will probably have an impact on your life soon enough. But you can't let every button push and distant threat pull you down, can you?
Also remember, life has its own ways: as far as you know, it could also be the beginning of the best days of your life.
Exciting if you don't think about how tons of people are going to be out of work with no safety nets, or how easily millions of people are going to be scammed, or how easy it is going to be to impersonate someone and frame them, or etc etc etc
Let's say, for the sake of argument, AI could generate absolutely perfect invented videos of arbitrary people doing literally anything. The consequence will be that video will no longer be taken seriously as evidence for crimes. People will also quickly not trust video calls without an extreme level of verification (e.g. asking about recent irl interactions, etc.)
Yes, some people will be scammed, as they always have been, such as in the recent Hong Kong financial deepfake case. But no, millions of people will not keep falling for this. Just like the classic 419 advance-fee fraud, it will hit a very small percentage of people.
OK, but I did like living in a universe where I could watch video news of something happening in another country and treat it as reasonably strong evidence of what is happening in the world. Now I basically have to rely on only my own eyes, which are limited to my immediate surroundings, and eyewitness accounts from people I trust who live in those places. In that sense, I feel like my ability to be informed about the world has regressed to pre-20th-century levels.
I predict that we will have blockchain integration in media-creating devices, such that any picture / film that is taken will be assigned a blockchain transaction ID the moment it is generated. We will only trust media with verifiable blockchain encryption that allows us to screen against any tampering from the source.
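For what it's worth, here's a toy sketch of the capture-time signing half of that idea (every name below is invented; a real scheme would use a hardware-backed asymmetric key and an actual ledger rather than a shared secret):

    import hashlib
    import hmac

    DEVICE_KEY = b"not-a-real-key"  # stand-in for a hardware-protected device key

    def media_fingerprint(media: bytes) -> str:
        # Content hash that any later copy of the file can be checked against.
        return hashlib.sha256(media).hexdigest()

    def sign_at_capture(media: bytes) -> dict:
        # The record the camera would publish the moment the file is created;
        # imagine this record becoming the "blockchain transaction ID".
        digest = media_fingerprint(media)
        tag = hmac.new(DEVICE_KEY, digest.encode(), hashlib.sha256).hexdigest()
        return {"digest": digest, "tag": tag}

    def verify(media: bytes, record: dict) -> bool:
        # Recompute the hash and signature; any post-capture edit breaks both.
        digest = media_fingerprint(media)
        expected = hmac.new(DEVICE_KEY, digest.encode(), hashlib.sha256).hexdigest()
        return digest == record["digest"] and hmac.compare_digest(expected, record["tag"])

    original = b"...raw sensor bytes..."
    record = sign_at_capture(original)
    assert verify(original, record)                  # untouched file checks out
    assert not verify(original + b"tamper", record)  # any edit breaks the match

The ledger part is only gestured at here; the point is that any edit after capture changes the hash and fails verification.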
Video alone has never been considered evidence of a crime in a court of law (At least in the United States). A person needs to authenticate the evidence.
Think about what it was like before the invention of the camera, and then after; this is a similar level of innovation. I'm sure a lot of people who wrote books were terrified by the prospect of moving pictures, but everything worked out and books still exist.
IMHO humanity will be fine, decades from now kids will be asking what it was like to live before "AI" like how we might ask an old person what it was like to live before television or electricity.
Consensus reality is already cracking up due to the internet, smartphones and social media. The media theorist Marshall McLuhan had a lot to say about this well in advance, but nobody listened.
The more people AI displaces, the fewer customers they would have, right? Wouldn't some equilibrium be reached where it can no longer grow due to falling profits? Let's say they mostly sell B2B; who are those other businesses selling to if no one (generally speaking) has disposable income?
The last two claims always felt wrong to me, because they assume a society where these kinds of tools are easy to use and accessible to everyone, yet where society at large is completely oblivious to these tools and their capabilities. Arguably, you could never fully trust images before; people have claimed things were photoshopped for decades now. Instead of something "looking realistic", trust in people and organizations will take its place - when, for example, the BBC posts a photograph, I'm inclined to trust it not because it looks real, but because it's the BBC.
Assuming OpenAI's lobbyists don't convince Congress to ban open models because of {deepfakes, CP, disinfo, copyright infringement} or make it impossible to gather open datasets without spending billions on licensing.
We've been on an unsustainable trajectory for quite a while now. I take hope from things like this. Maybe this time it'll finally be the shock we need to rethink everything.
> Historically, letting technology eliminate their jobs has been a sacrifice people have made for their kids' sakes. Not intentionally, for the most part, but their kids ended up with the new jobs created as a result. No one weaves now, and that's fine.
Ah perfect, all we have to do is consider a vague analogy to a totally different event in the past and it's clear that there's no worries if AI takes the vast majority of human jobs in the next 50 years.
As a side note I shudder to think how many nightmare fuel cursed videos the researchers must have had to work through to get this result. Gotta applaud them for that I guess.
Sentence 1 seems historically illiterate, and I think pg knows how ridiculous it sounds because he walks it back almost immediately. "Historically people made a sacrifice, but not intentionally, for the most part" is incoherent.
> No one weaves now, and that's fine.
Did horses find new jobs when we moved to steam power? Leave aside the odd horse show and fairground ride. By the numbers, what do you think happened?
I can't imagine Paul Graham actually thought through the scenario he's describing. The kids of the parents who lost their jobs, throwing their lives into disarray and desperation, are not going to be the primary recipients of the new shiny technologically advanced careers.
Something I've realized over time is that however good things get, what people really want is to have more than their neighbors, rather than any particular quality of life. In many ways, with a relatively basic job, you can live far, far better than the richest kings could have dreamed a few hundred years ago. Something as simple as being able to eat strawberries in December was described as literal magic in fairy tales fairly recently. Nevertheless, this does not satisfy the need for social prestige, and people are profoundly unhappy as a result.
I don't think anything will fix or change this, definitely not UBI, the situation is a fundamental part of the human condition. I share your dread and fear that I will not be able to compete, even if my life improves by all other measures.
The one thing that would dramatically change my calculus is medical advances that significantly push back death and aging.
Does anyone know how to handle the depression/doom one feels with these updates?
I just dread, *shrug*. You don't have to be depressed or doomed, it all comes from premature predictions.
That said, we are surely at a phase similar to the one right before the internet, if not electronics. I genuinely don't understand people who write AI off as yet another Bitcoin or "just an enhanced chatbot". They'll have to catch up on an insanely complex area (even ignoring the rocket science behind it all), which will move on without them, and right now is the chance to get aboard early.
It's only my opinion, and I'm a nobody, but I can't see the way in which _that_ could fail. I find it incredibly stupid to think it will and to just live your life as if nothing is happening. If you're not the ruling class or a landlord, f...ing learn.
If you think EU funds are going to be there funding those social safety nets in the Brave New World where AGI decimates industry... They're not even sustainable as is.
You should feel the opposite, see it as a new tool in your pocket.
Industrialisation and computers/automation took away massive amounts of jobs while globally improving people's lives; this may possibly (maybe not) do the same.
If, in the future, anybody can write a book, create a photograph or a motion picture or a music album with just a few words describing what they have in mind, this will be a tremendous productivity improvement and will unleash an overflow of human creativity.
I like to compare it to what Jobs said about computers: they are the "bicycle of the mind" [1].
I tell myself it’s important to try and be less myopic.
One reason is that the readiness of a technology does not mean it's being applied.
Another is that, just as OpenAI came out of nowhere, others will too. It's normal, when focused on a few things, to lose sight of that.
Gemini realistically does some impressive things.
What can we do?
Building the tech is important, but applying it well for actual adoption is still wide open for the average person's use.
It does seem to mean that what we think might take 5 years will probably take 1 year in 2024, if not less, as in 2023. So think 10x, and then 10x again, as the real goal.
People will not lose their jobs, because you still need someone to input prompts for 10 hours straight in order to get the piece of video you want.
Natural language is not perfectly precise to get exactly what you want from a model like this, and the results remain kinda random. Instead of making video in a traditional tool you will be spinning AI roulette until it generates your desired result. And even then you will probably want to edit it.
I think the deal is that these breakthroughs aren't really great in the general case. They help with specific instances of work, and make individuals way more productive. That by itself probably won't end up making people obsolete. I think, by and large, it's just going to make people more productive. You probably don't want to work somewhere where that isn't a welcome thing.
Learning and fitness are great ways to avoid the feeling of doom.
Open-source will catch up in <6 months. Note that Meta will ship Llama 3 any time now. So will Mistral.
I am a PM, and switched to becoming a builder. Enjoy learning, keep building. What people take time to realize is that building things is a habit. As you combine that with ongoing learning, you will enjoy the process and eventually build something to earn a living.
Sure, I can learn how to use these models, but then how do I find things to build? I've always struggled with finding real ideas, and so I just watch AI progress and come up with blanks whenever I try to think of ways to contribute.
UBI, social safety nets, power... Because of videos? I don't get it. The obvious second-order effect is a devaluation of visual media. Let it all burn, who cares? Go live in the real world. Current mass media culture is an anomaly stemming from the hyper-centralization of culture creation. Where we're going is a reversion to the mean.
OpenAI is not going to solve global warming, and we will be well into widespread collapse of farming systems, mass migrations, and wars of scarcity long before the robots are doing all the work. AI isn't going to solve any of that, so... if you're looking for depression/doom, that's probably a better place to look.
I’m ecstatic about the future of education. I remember many occasions of teachers going, “Gosh, I wish I could show you guys this”. Now, they can with speed and ease. I’m particularly excited for ESL learners to have high-quality low-cost tools on hand for personalized learning for every child.
For most of human history, people didn’t have constant access to art work or videos. Things were fine. Maybe instead of watching manufactured shit on social media, go see a live theatre production. Seek genuineness.
You can live a great life without ever seeing a picture or video.
Nothing on this page has any relevance to employment or UBI. Also, there is strong evidence that UBI doesn't affect employment much one way or the other.
Whether people are employed or not is a policy decision of the central bank and not related to how good AI is.
Yeah, today spooked me too. Between this, the large context length on Google's side, and the ability to understand video (and thus, say, a video feed for work tasks), it sure seems like the number of jobs in the firing line just jumped massively.
Just bask in the knowledge that if those "social safety nets" and UBI become a reality, you'll have more problems than you do now. You'll look back at this moment in time with fondness. Enjoy it now.
I don't think any of those are things AI can't excel at in 5-10 years. The latter 3 will require integration with robotics, but that's not exactly science fiction these days either, it's just something that maybe nobody bothers to do.
I believe you're right. There may be a better quality of life in looking at the ways of the ancient past. All AI is going to do is make the rich richer. Why try to compete with this?
There is no UBI coming; the government can barely fund current budgetary needs without borrowing tons of money. No jobs means no taxpayers, which will further shrink government budgets. We are on our own, as I see it.
All those videos made me so scared of what's about to come in the next few years. India is already a major market for perpetrators of misinformation, and with major social media giants only paying lip service to our concerns, with western countries being their major focus, things look set to get even darker for the poor and the disenfranchised in our part of the world.
UBI was tested during 2020 on a nearly global scale. In the US, the CARES Act, which provided stimulus checks for every tax-paying US citizen as well as extensions to unemployment, was essentially a giant UBI experiment. Not for AI, but for a giant shift in economic activity in which many individuals became unemployed nonetheless.
UBI has three basic properties, none of which the CARES Act fulfills:
1. Covers cost of living for some basic standard (debatable, but should include food, water, and shelter at minimum)
2. Is available to everyone without onerous requirements or means-testing (i.e., is "universal")
3. Carries a reasonable expectation of continuity such that people can plan around continuing to have it
The CARES Act was an emergency measure that absolutely zero people expected or intended to be permanent, it was laden with all the means-testing and bureaucratic hurdles that unemployment generally carries, and it very clearly did not provide adequate support for quite a lot of people.
It's meaningless to call something a "test" when it carries none of the properties that proponents of the policy claim make it desirable. The only perspective from which the comparison even makes sense is that of someone who hasn't considered it seriously and has come up with a strawman to argue against (i.e., something like "UBI is the government giving people some money").
It also seems worth mentioning that I really don't buy the highly political claim that some people seem to view as self-evident: that people remained unemployed longer because they got extended unemployment benefits, rather than as a result of the massive economic shock that prompted that decision in the first place
It may be modeled on UBI, but it's not UBI. Universal basic income is perpetual and unconditional, while the CARES Act was a one-time payment in response to COVID. I'm sure there's still a lot we can learn from it, but I also expect many of the psychological effects will be somewhat muted.
I can't say I know what the future economy will look like, but I can say for certain it won't just be the current one minus 99% of jobs (with all those previously employed people living in abject poverty). Capitalism doesn't work without customers. Capitalism doesn't work without scarcity. Capitalism depends on a minimum money velocity where paradoxically if you collect 100% of the money, it becomes completely worthless.
To me, it seems guaranteed that there will be drastic changes. There will be many attempts at new ways of organizing society, with successes and failures along the way. Not out of altruism or a desire to share, but out of the self-interest of those who collect the power afforded by AI and automation.
AGI would give you access to millions of times more resources than you currently enjoy. So I would suggest that you have absolutely nothing to worry about on the income/employment front.
One company having that much power is a different matter, and I address it by looking at how we can distribute GPT training through decentralized and open platforms.
It won't be me or you. Whoever it is, they will not share any of the economic upsides of AI with the public unless they are legally forced -- zero, zip, zilch, nada. Even then, they will keep the lion's share for themselves, and they will use their surplus to shape society to their advantage.
So yes, many millions of us have a big problem to worry about, especially considering how much struggling there already is now.
If the AGI is open source and operates through a decentralized platform, then everyone/no one owns it, and the upside will be fully distributed to end users.
But even if it stays in private hands, one company monopolizing a technology and keeping it expensive/out-of-reach is generally not how technological innovation works. There is generally intense competition between providers, with each aggressively cutting prices to capture market share.
> AGI would give you access to millions of times more resources than you currently enjoy. So I would suggest that you have absolutely nothing to worry about on the income/employment front.
I'm puzzled as to how you can characterize a description of AGI's functions as "theology." AGI represents the automation of what we would describe as human-level thought, transforming it into a mass-produced service that costs almost nothing to acquire. Consequently, the cost of any product or service that requires human labor is expected to trend toward zero.
We're already witnessing this with the creation of textual and graphical content through ChatGPT. It's now possible to generate various types of text content and a wide range of graphics at a cost of 10 cents for a dozen ChatGPT API calls. And the work is completed in a few minutes, as opposed to several hours. This represents a several-orders-of-magnitude increase in per capita productivity for these specific tasks. As AI technology advances, the scope of applications benefiting from such productivity boosts is expected to widen, which means human civilization will experience a revolutionary increase in productivity, and with it, resource abundance.
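To make the cost claim concrete: one of those "dozen API calls" is just a few lines with the openai Python SDK (the model name and prompt below are placeholder assumptions, not anything the parent specified):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # One short content-generation task of the kind the parent describes.
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder model name; pick whatever fits the budget
        messages=[
            {"role": "user",
             "content": "Write a 100-word product description for a ceramic travel mug."},
        ],
    )
    print(response.choices[0].message.content)

Each such call costs on the order of a cent, which is roughly consistent with the 10-cents-per-dozen figure above.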
The graphics I've produced are absolutely phenomenal IMHO. I find that textual content depends highly on the prompt, and often takes quite a few iterations to get right.
Often with text, the GPT is more of an assistant/advisor/proofreader for me, rather than a stand-alone creator of quality content.
Sometimes it works really well for emails, like producing responses to formal communications. It can cut the time to respond from 10 minutes to 30 seconds.
What will you do with millions of times more resources than you currently enjoy?
I, for one, would be overwhelmed. In the meantime I will be passionate and joyful about the things I like, regardless of whether AI can do them a million times better. I have fun doing it... while the AI is... just AI.
Personally, I love life, and I expect AI will allow me to spend more of my life taking it in instead of running through the gauntlet of errands needed to stay alive. I also expect it will help us live much longer, which is an absolute blessing considering how precious every moment is.
Stop pathologizing normal human feelings? If you're worried, learn how to use the tools to give yourself a competitive advantage. See steam trains, electricity, microchips, computers and the Web for historical examples of worried people adapting to game changing tech.
I am. I know we're in a situation now as programmers where there is more AI tooling and more programming jobs - but it's difficult for me to see that last.
You could be the best at using the tools, but I think there could be a point where there is no need to hire because the tools are just that good.
Have you considered what an enormous jump in career that is? Or that all the people who already started building AI products are being obsoleted by OpenAI a year after they started?
What concerns me is that Google and OpenAI are racing us to a point where almost no product is valuable. If I can just have AI generate me a booking.com clone, then what's booking.com worth?
There is zero chance this tech is going to be locked up by a few companies, in a year or two open models will have similar capabilities, I have no idea what this world looks like but I think it’s less of a concern for individuals and more of a concern for the global economy in the short term.
Outside of all of this, yeah, we're either going to have to adapt or die.
Well, alone I was able to launch a software company in 2010. From accounting to nginx, everything was automated.
Alone, maybe I will be able to launch a unicorn in 2030. It's just tools with more leverage. The limit is just the computing resources we have, so we'll have to use computing resources to calculate how much of the earth's resources each of us can use per year, but that seems like a usual growth problem.
That is my point though, I mean it's good you could launch the company, I just don't know what happens to the large companies that employ a lot of people. Seems like they're heading into dangerous territory.
Times change. People adapt. Happened before, will happen again. Some adapt willingly, some hesitatingly.
The current AI wave is a corporate-funded experiment desperate to find something compelling beyond controlled demos to economically recoup the deepening hole in their balance sheets. The novelty has begun to wear off, the innovation has started to stagnate, and the money is running out. The only money-making innovation left to be seen is in creating more spam. That's where I see this wave headed.
OpenAI has proven it's a shit company with rotten fundamentals playing with a shiny new toy. They will crash and burn spectacularly. As many before have done in various fields.
My reaction after using any AI tool from the last couple years to do anything meaningful ends with just a big facepalm.
As Antonio Gramsci said: “Pessimism of the intellect, optimism of the will.”
The forces of blind or cynical techno-optimism accelerating capitalism may feel insurmountable, but the future is not set in stone. Every day around the world, people in seemingly hopeless circumstances nevertheless devote their lives to fighting for what they believe in, and sometimes enough people do this over years or even decades that there’s a rupture in oppressive systems and humanity is forever changed for the better. We can only strive to live by our most deeply held values, within the circumstances we were placed, so that when we look back at the end of our lives we can take comfort in the fact that we did the best we could, and just maybe this will be enough to avert the inevitable.
Honestly? Just an embrace of cheerful, curious nihilism. Between this and climate change, we are entering interesting times, and remembering that I’ll be able to “opt-out” at a time of my choosing, and then embracing the time left with happiness and curiosity.
“Glad did I live and gladly die, And I laid me down with a will … Here he lies where he longed to be; Home is the sailor, home from sea, And the hunter home from the hill.”
And yet you're dodging the question: how did it end up for the Luddites, specifically? You're not a hypothetical person in the far-future that has had time to adapt to this technology, you're a person in the here and now, and the wave is rushing towards you.
Well, smashing the machinery certainly didn't get them their jobs back. Worrying about the machinery didn't get them any new jobs either. I guess some of them fell destitute because they were too angry or unfortunate, some others got jobs doing something else, and others got jobs operating the machines that had replaced them.
There are winners and losers, but it's absurd to think that we should avoid progress to protect jobs.
My preferred method for ensuring a just transition during times of technological progress (and to eliminating involuntary unemployment generally) is a Federal Job Guarantee http://www.jobguarantee.org/
Personally, the advances in AI have just made my job easier and allowed me to get more done. I don't see that trend changing either.
It's just a digital CONTENT! machine, it's not a big deal. CONTENT! consoomers - rejoice, producers - keep producing. Where does the sense of doom come from? What power does this company even have? The power to churn out more movieslop? Is that powerful? We've had decades of that, it's tiresome.
Touch grass, tend your garden, play with your kids, drink a beer, bake a pie, write a poem, take a walk, carve a sculpture, play a board game, mend a sweater, take a breath. Relax.
The climate is burning. The Amazon is likely collapsing sooner than we expect. There are plenty of wars around the globe and a major multi country conflict brewing in Africa. Western politics are laughable, and still the best if you want to be free to say what you want and have rights. Inequality is incredibly high and rising. And so on.
So there are a lot of things to be depressed about before you get depressed about a little increase in misinformation and idiocy on the interwebs. I mean... things like polio and the measles are literally back to fuck with us because people are so fucking stupid they think vaccines are a bad thing.
A lot of the things you mention are happening because of rampant misinformation. Something these tools will help create more of at an unstoppable rate.
The point I was trying to make was that there is no reason to worry about us setting fire to a fire. Of course you're correct, it'll get worse, but it's not like it wasn't terrible to begin with.
If anything, the optimist in me is hoping that all this "AI" generated content is going to make the internet so useless that our society (well, the part that doesn't believe the earth is flat and that Bill Gates has mini clones in the vaccines) finally gets away from it. In my region of Denmark our local police post their immediate updates on Twitter, which was fine when everyone could see them, not so great now that you need an account. I very rarely care about what they post, but around New Year's a fireworks container blew up near here, and I had to register (and later delete) a Twitter account to figure out whether I had to worry about it or not. It'd be nice if the impending doom of fake content moved our institutions and politicians away from big tech SoMe platforms, and it just might, if those platforms become useless.
I think it's naive to attribute all of the world's problems to "misinformation". You can give everyone the same information; in fact, we all already have the same information. But perspectives will vary, and there will be conflict.
I don't attribute it all to misinformation. I attribute it all to greed but misinformation is a great tool for the ruling class to satisfy their greed.
Unless you think everyone else is lying, people can and do find meaning in their lives, in the activities they love. AI has no bearing on that (just careers) so there's no reason to believe it would "accelerate" anything.
Thanks for taking the time to explain. I do kind of think people are lying (to themselves). Ignorance is (temporary) bliss.
I'm seeking lasting meaning; not 'meaning' that dissolves after a season, or at best, at the end of a life.
What I meant by 'accelerat[ing] the realization' is that all of our earthly desires will more readily be fulfilled, and we will see that we still feel empty. AI is like enabling a new cheat code in the game of life, and when you have unlimited ammo the FPS becomes really fun for a moment but then loses its meaning quickly.
I can get that too. We're the arbiters of meaning in our lives.
> earthly desires will more readily be fulfilled, and we will see that we still feel empty
You misunderstand if you believe that the secular perspective on meaning is to reach it through sensualism, by consuming. That is all that AI can more readily fulfill.
Read the Bible. Specifically Revelation, 1/2 Thessalonians, and Daniel. If you haven't before, you'd be surprised how much of what's taking place now is prophesied.
Many people, rightfully, (over-)react to the American caricature of Christianity (mega churches, Kenneth Copeland, etc.) as the definition of what it is (that's arguably the deception hinted at in the Bible), but reading/trusting the raw word—what's referred to as "sola scriptura"—is remarkably helpful in navigating what's taking place.
> Read the Bible. Specifically Revelation, 1/2 Thessalonians, and Daniel. If you haven't before, you'd be surprised how much of what's taking place now is prophesied.
…said every doomsday preacher since the Bible was written.
I'm not a preacher and I don't subscribe to any church/denomination. I think, generally speaking, religious leadership in the world is in a state of apostasy and is guilty of leading people away from God/Christ's message (which the Bible prophesies would happen).
Read them. They're all quite similar, mostly differing in tone or structure. I recently built an app to compare the ESV, KJV, NASB, NLT, AMP, and ASV translations side by side, and they're all very similar. Even obscure translations follow the same structure and message (they have to; doing the opposite is warned against in the Bible).
Protestantism copied/copies a lot of the non-Biblical tradition propagated by Catholicism (e.g., Sunday replacing the Sabbath, recognition of non-Biblical holy days, claiming "Jesus nailed the law to the cross" when the exact opposite is stated in the Bible, etc).
That's neither here nor there when the entire point was to eschew non-Biblical tradition maintained by the Church. Notwithstanding that there are various Protestant sects and some do not practice what you're accusing them of. It's a large tent, the beliefs aren't that specific.
You're completely missing the point. Unfortunately, it seems you lack the capability to see the point—to wit, you're not special. So, as the Lord commanded us in Isaiah 1:16 — I wash my hands of thee.
And religion in general is in a state of apostasy and is guilty of leading people away from reality. That's why it was invented, after all, and it sure has done its job. You'll never find a place more full of delusional self indulgence and aggrandizement than a church, regardless of which religion or denomination they subscribe to.
> You'll never find a place more full of delusional self indulgence and aggrandizement than a church, regardless of which religion or denomination they subscribe to.
Correct, which is why I avoid religion (in the institutional church sense). I'm a bit of an odd duck because I came to the Bible after having been a practicing Buddhist for several years and generally being unexposed to Christianity (save for a lukewarm exposure to Jesuit Catholicism) or any religion growing up.
Having lived a mostly-secular life and only later (at age ~30) coming to Christianity, I can confidently say that in regard to reality, it's taught me that it's highly subjective. What most people consider as "reality" is just the interpretation of what they see that keeps them from losing their mind. For some, reality is being an unhinged hedonist, for others it's planting a garden, and for others it's generally just "trying to be nice and getting along."
Personally, God/Christ (and by extension, what's recorded in the Bible) is the interpretation of reality that makes the most sense to me. In practice/study, I've found that it maps 1:1 with what I see while also filling in the blanks on things I can't explain (e.g., the ability for the human body to heal itself, the pace/behavior of nature, or humanity's unrelenting drive to destroy what it doesn't/refuses to understand).
I guess it’s nice that you believe that, but the truth of the matter is that you are about as close to traditional right-wing mainstream Christianity as it’s possible to conceivably get. Like, if I were to imagine the archetypal Christian hypocritical sinner… it would be you.
Nothing protects your other examples from corruption.
> why choose one of the most corruptible and corrupted approaches?
Because when the non-prophetic elements of it are applied to life, all of the anxiety, fear, and dread you feel evaporates. It's only when you view it through the lens of a "church" or "leader" (read: group) that it loses its meaning.
I've read the I Ching and it lacks a religious/church element which leads to the conclusion you've had. It's not until people take it and turn it into something it isn't that it loses its value.
Arguably, Christianity, due to its claims, has become weaponized. Interestingly, this very outcome is prophesied in the Bible (which, personally, cements my faith in what it prescribes).
The world is changing before our eyes. It's exciting, sure, but I am also deeply afraid. AI may take humans to the next level, but it may also end us.
...and our future lies in the hands of venture capitalists, many of whom have no moral compass, just an insatiable hunger to make ever larger sums of money.
Even though this is highly impressive, I think it is still important to stay rational and optimistic to see the other side of the coin.
Every industrial revolution and its resulting automation has brought not only more jobs but also a more diverse set of jobs, and with them new industries. History rhymes; the ruling fears in such times have always been similar. Claims are being made without any reasonable theories, expertise, or provable facts (e.g. Goldman Sachs' unemployment prediction is absolute bs). This is even more true when such AI matters are thought about in more detail. Furthermore, even though they employ tens of millions of people, probably only a few industries like content creation, movies, etc. are affected. The affected workforce of these industries is highly creative - that's what they are being paid for. The set of jobs today is big; they won't become cleaning staff, nor homeless.
This technology also has to prove itself. (Its technical potential may be unlimited, but it is financially limited by the size of the funds being invested, and those are finite.)
Transition to the use of such tools in corporations could take years, depending on type, size, and other parameters. People underestimate the inefficiencies that a lot of companies embody - and I am only talking about the US and some parts of Europe here. If a company did their job the same way for two decades, a sudden switch does not happen overnight. Affected people have ways to transition to other industries, educate themselves further, and much more. Especially for someone living in the west, the opportunities are huge. In addition, the wide array of variables about the economy and the earth, and all its differing societies, comes into play: some corporations want real videos made by real people; some companies want to stay the way they are and compete using their traditional methods; corporations are still going to hire ad agencies - ad agencies whose workflow is now much more efficient and more open to new creative spheres, which benefits both the customer and themselves. The list could go on endlessly.
Lots of people seem to fear or think about the alleged sole power OpenAI COULD achieve. But would that be a problem - would "another Alphabet" be a problem? Hundreds of millions of people benefited, and are benefiting today, from their products. They have products that are reliable and work. (This forum of tech experts is a niche case; nearly all people don't care at all if data on them is used for commercial purposes.) Google had a patent-guaranteed monopoly on search. But here we have an almost unpatented (and barely patentable) market, an open-source community, other companies of all sizes competing, innovation happening, and much more. It is true that companies like OpenAI have more funds available to spend than others, but such circumstances have always driven competition and innovation. And at the end of the day, customers are still going to use the product they have decided is best.
I know I may be stating the obvious, but: the economy and the world are a chaotic system with an unpredictable future.
I can’t wait for the day I can strap on my Apple® Vision Pro® 9 with OpenAI® integration and spend all my time interfacing (wink) with my virtual girlfriend. Sure my unlit 3 by 3 meter LifePod® is a little cramped and my arm itches from the Soylent® IV drip, but I’ll save so much time by not having to go outside and interact with legacy humans!
That sounds like a nightmare! You have to buy so many products, you have to keep them charged, and you're still missing out on so much. You should instead get the Neuralink plug-in pod with built-in feeding tube and catheter.
I have a book I've written (first three parts available free at https://www.amazon.com/Summer-of-Wonders/dp/B0CV84D7GR). Is there some way to feed this to the tool and get an animated version out? Or this with some other tool(s)?