Zero marginal content already exists with procedural generation (PG).
The problem with PG is that it all starts to look the same after a while - e.g., one of the chief complaints about No Man's Sky is that the PG planets all look alike after a short period of time.
The real value isn't zero-marginal content, but zero-marginal narration (or storytelling) that breaks new ground. Piecing together PG or zero-marginal content isn't enough; any next-step breakthroughs will come from higher-level orchestration of that content.
Here's my observation as mostly a player and (non-game) programmer, but sometimes a creator. To make these open-ended games interesting there need to be large numbers of distinct, unique interactions between things in the world; exhaustive testing will therefore be impossible, and you will need to approach testing differently. If you're making a game like this and you give people the chance to play a demo, but they don't completely astonish you by doing something you'd never considered, then your game is going to be disappointing.
The game world needs to make some sense (i.e., the results of an interaction shouldn't be merely pseudo-random), but it's OK if it's a bit weird. The real world is a bit weird, a little bit off the trivial defaults. Mario interacts with his world in a more or less predictable way... but lots of the edge cases are strange. Mario can't walk off the screen... but he can be pushed off by things. He bounces on some objects... but the rules for how Mario bounces are hard to grasp, and for a casual player it doesn't matter. In Minecraft, if you put two bucketfuls of water a space apart from each other, the result is unlimited water. But if you put two bucketfuls of lava the same distance apart, nothing interesting happens. Inconsistent, but not so haphazard as to be baffling.
If your world is in the resulting sweet spot, people can entertain themselves more or less indefinitely, just like in this world. You can procedurally generate such worlds, but you must be prepared for your players not to interact with them the way you intended. When you buy a child a $$$ game and they spend hours happily playing with... the cardboard box it came in, they aren't doing it wrong; they're having fun, and you don't get to dictate what other people enjoy. Lots of video game creators have that ego problem where they can't let go of how they thought the game was supposed to be played, and if you use procedural generation, that attitude is inherently wrong. Write a visual novel next time.
Maybe I don't see it. A friend spent a few hours running what amounted to a tourist bus service in Elite Dangerous, where somebody would contract him to fly to specific places and then come back. But he wasn't playing it a week later, so I have no idea whether there's actual depth there (build a chain of tourist hotels orbiting interesting phenomena, maybe?) or whether that tourist job is just one of a handful of ways to make some money that doesn't really go anywhere.
The tourist bus thing is procedurally generated missions saying "fly from X to Y". It doesn't really go anywhere interesting; it's no different from flying cargo missions.
I think you are spot on. I have been toying with a formal theory about this stuff which I then turned into a rambling HN post. Here's what I came up with:
In my theory, the reason that NMS is unsatisfying is because it has great procedural generation breadth (PGB), but insufficient procedural generation depth (PGD).
PGB is defined as the raw number of composable pieces that can be used to generate artifacts: how many plant bits, animal bits, biome bits, etc. are available to the procgen system. NMS has sufficient PGB; there are lots of bits.
PGD is defined as the influence or interaction between PG artifacts. The influence can manifest either in the generation itself (a high-gravity planet reduces the maximum size of procgen critters) or in artifact behavior (a cold planet around a dim star alters critter behavior, reducing movement to conserve energy). NMS has neither, as far as I know. Human-defined categories or tag metadata for PG elements do not count: declaring that a set of procgen components is all "desert" is not an interaction. NMS has very low PGD; the animals, plants, and biomes do not influence each other. The devs have grouped components together to make biomes look and feel self-similar, but this was not done procedurally and has no depth to it.
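To make the distinction concrete, here's a toy sketch in Python (all the attribute names and numbers are made up for illustration -- nothing from NMS's actual generator). PGB is the size of the component lists you pick from; PGD is the planet's attributes reaching into the critter's generation and behavior:

    import random

    def generate_planet(rng):
        return {
            "gravity": rng.uniform(0.3, 2.5),          # relative to Earth
            "star_luminosity": rng.uniform(0.1, 1.5),  # relative to the Sun
        }

    def generate_critter(planet, rng):
        # PGB alone: pick parts from big component lists
        body = rng.choice(["quadruped", "biped", "serpentine"])
        # PGD: planet attributes constrain the generation itself...
        max_size = 10.0 / planet["gravity"]  # high gravity -> smaller critters
        size = rng.uniform(0.2, max_size)
        # ...and influence behavior, not just appearance
        activity = "torpid" if planet["star_luminosity"] < 0.4 else "active"
        return {"body": body, "size_m": round(size, 1), "activity": activity}

    rng = random.Random(42)
    planet = generate_planet(rng)
    print(planet)
    print(generate_critter(planet, rng))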
A game like Rimworld, on the other hand, has lower PGB than NMS; by the numbers, there are far fewer variations of entities in Rimworld than there are in NMS. But Rimworld makes up for this with much higher PGD. The procgen landscape influences climate and biomes. The biome and latitude influence the growing season, which influences the carrying capacity of herbivores, which influences the carrying capacity of carnivores. The procgen of the pawns' social and family relationships influences how they behave toward each other. The available calories on the map influence what pawns are able to eat, which interacts with their food preferences... The most important thing is that PGD makes the procgen entities actually matter and creates surprising stories.
(As an aside, it occurs to me that PGD is directly related to emergent behavior. It is possible that this could be formally proved, since the emergent behavior of Conway's Game of Life (GOL) is entirely predicated on neighboring cells affecting each other. It might be possible to prove that any system with sufficient PGD is capable of emergent behavior, which is really what we want from a PG system: we want to be surprised by the unexpected, something GOL is absolutely capable of. In fact, you could probably use GOL as a starting point for this whole theory of PGD/PGB... hmm...)
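GOL really is the minimal demonstration of this: every cell's fate is determined entirely by its neighbors, which is pure "depth" with almost no "breadth". A minimal sketch in Python:

    from collections import Counter

    def step(live):
        """One Game of Life generation; `live` is a set of (x, y) cells."""
        neighbor_counts = Counter(
            (x + dx, y + dy)
            for (x, y) in live
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)
        )
        # A cell lives next turn with exactly 3 neighbors,
        # or with 2 neighbors if it is already alive.
        return {c for c, n in neighbor_counts.items()
                if n == 3 or (n == 2 and c in live)}

    glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
    for _ in range(4):
        glider = step(glider)
    print(sorted(glider))  # the same glider, translated one cell diagonally

Four lines of rules, and out comes a pattern that moves itself across the grid -- behavior nobody wrote down explicitly.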
The king of PGD is Dwarf Fortress. It has very high PGB and PGD, and its ability to create surprising stories is legendary. My hypothesis is that this is _why_ Dwarf Fortress is so good.
Anyway, since NMS has no PGD, it has no emergent behavior (except by the players themselves), and you come away from it feeling that it is a mile wide and an inch deep. No shade, though! It is a beautiful piece of work -- I have been playing it since launch, and I think it is a towering achievement (not to mention the most incredible PR story in gaming history). No Man's Sky is an outstanding work of art, and it should be remembered for what it is, not what it could be (or what many people think it should be). And the emergent gameplay of the players is really fun too.
But when I think about what it might be like to have a higher PGD in NMS... it would be really something.
I don't think you can call procedural generation "zero marginal content." When dealing with PG, the content is the procedure, not the output, and the procedure never changes. There is a distinction between PG and a generative model like DALL-E. PG involves procedures that are written and understood by humans. Something like DALL-E, while technically a deterministic procedure taking inputs and producing outputs, operates via a process that is not directly understandable to humans.
“As she came nearer he saw that her right arm was in a sling, not noticeable at a distance because it was of the same colour as her overalls. Probably she had crushed her hand while swinging round one of the big kaleidoscopes on which the plots of novels were ’roughed in’. It was a common accident in the Fiction Department.”
Having worked on adjacent problems every day for 10 years, maybe my opinion counts for something (or maybe not).
But I think the top game of the 2030s will be something like AI Minecraft or AI Steam, where everything, including the very rules of the game, is generated from a structured data set optimized for the player.
And I think the "metaverse" (as much as I loathe the term) is going to go down as the labeled training set for bootstrapping this, just like the open web was the catalyzing training set for the (already admittedly magical) AIs we have today.
Further, I think Facebook won't be the one to design this, because that is not what their share price incentivizes.
Meta fundamentally doesn't understand what the Metaverse is going to be.
They picture a virtual space where you, a human, will go to interact with other humans you know and love.
They are thinking MMO Metaverse.
What it's actually going to be is a place where you go mostly alone into a virtual space that's being actively curated for and in response to your interactions with it.
Most people aren't going to care about hanging out for hours in VR with Aunt Patty whose political rants they can barely tolerate on FB.
But being able to bring back dead pets or loved ones to interact with, or have simulacrums of celebrities who want to be your friends, or experiences that are tailored on the fly specifically for you, using eye tracking and pulse to re-engage you the moment your interest flags?
The Metaverse is going to be the place where AI comes alive in ways it will be prohibitively expensive to do in real life with robotics and manufacturing. And ultimately what that can offer will beat out all other media.
We may have limited invitation of loved ones into our curated spaces, but it's mostly going to be a solitary (and yet intimately social - AI therapists/friends are going to end up well beyond Eliza) place.
Meta is a company that for over a decade wasted the data-gathering opportunity of finding out what people DON'T like, to the point that it's damaged society.
They're just not going to "get it" in order to succeed as long as Zuck is CEO, in spite of very talented engineers.
As someone who's never played Animal Crossing, this sounds a lot like my impression of that game, only it's more dynamic and customized to you.
>simulacrums of celebrities who want to be your friends, or experiences that are being tailored on the fly specifically for you
If Aunt Patty spends her time hanging out with and unloading her crappy opinions on AI Paula Deen instead of spewing them publicly on FB it could be for the better. Although it seems like it could only make people's filter bubbles much worse.
While custom-tailored games may be interesting, it seems like such a thing would be socially fragmenting and isolating. People need shared experiences to relate to each other. I'm not sure the world would be a better place if people gradually have fewer and fewer shared experiences.
It seems like we could have both; people can generate worlds by some combination of automation and manually tweaking parameters or mods, then they can share that world with their friends, and visit worlds created by their friends. Some people may have esoteric taste, but the internet is good for finding people who share your esoteric taste, for better and for worse.
That raises the question of how those friendships came to exist in the first place. I'm of the last generation that had a largely analog childhood, where friendships were the natural outcome of being bored and in the physical presence of others. If our technology reaches a place where people basically never have to be bored, nor do they have to be physically present with others to alleviate boredom, then I wonder on what basis anyone would ever develop deep relationships outside of family.
I wonder if carving out individual experiences would prevent users from displaying their use of said entertainment to create social status. So I’m coming at it wondering if people value the status that their form of entertainment provides more than the experience itself. On the other hand, maybe we’ll continue to just observe the death of the “main stream” as we all slip into our own niche communities, each with its own complex system of status signifiers?
This stuff really leaves me pretty puzzled. I’m a culture guy, English grad. Art is not supposed to behave like this!!!
> But I think the top game of the 2030s will be something like AI Minecraft or AI Steam, where everything, including the very rules of the game, is generated from a structured data set optimized for the player.
Steam and Minecraft are both social by nature. People very often want to play with other people. It's like the joke I always make about AI feeds on Netflix: they recommend the same thing to everyone because the AI realized that everyone wants to talk about the thing they saw more than they want to enjoy seeing the thing. Humans are social creatures.
A Calvinball AI would be a pretty interesting novelty, but I think good games are usually focused on simple rulesets. Not sure an AI is really needed there. Maybe you just mean hyper-tuning drop rates and modifiers and such?
I'm sure AI will eventually have a huge impact on the art, narrative, and engineering of games, though, so maybe you're correct that will bleed into game design as well.
I think with eye/attention tracking it will be absolutely amazing what 3D content can be generated and optimized by AI/ML to maintain the interaction feedback loop. I'm not sure it will be a good thing (at least for those who didn't grow up with it)... much like FB doomscrolling is a problem for the over-50 set.
Wow! That's a crazy thought. Eye tracking tightly coupled (i.e., in a ~realtime feedback loop) with PG/ML/AI seems ... powerful, or scary, or both. Something along the lines of a computer controlled lucid dream, sprinkle in a handful of whatever the equivalent would be of blinking banner ads, product placements, or subliminal messages, etc. and my mind spirals out of control imagining how that would play out.
We are still extremely early in this space. This is the mainframe era of generative machine learning.
I'm expecting three big leaps in the next 10 years (in order):
1) Generative algorithms reach a level where the content is indistinguishable from human-generated content. Compute: performed on mega-clusters such as those used for DALL-E, GPT-3, and PaLM.
2) DALL-E/GPT-3/PaLM-level generative algorithms are able to run on personal hardware (phone, laptop)
3) Generative algorithms are able to fine-tune/train on personal hardware
Right now the algorithms are moving much faster than the hardware, which is why we are seeing large language models and gigantic generative models such as DALL-E 2. In time the hardware will catch up. For the next few years, applications of these models will be restricted by the fact that they're only accessible through API calls to mega-clusters run by Google, OpenAI, etc.
In time the hardware will improve and architectures will become more compute-efficient, to the point where we can achieve "human-level" (loosely defined) generation on personal hardware. Real-time generative content will change the way we consume content entirely, especially in the context of AR. Augmenting our view of the physical world with an infinite number of different filters will create infinite use cases for AR.
On the large language model side, once LLMs exceed our ability to comprehend libraries of documents, that will change the way we work, the way we perform science and many other things. Imagine querying your team's entire library of documents, design reviews and code with prompts such as: "Why was the load balancer for X service designed in this way?".
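You can already cobble together a crude version of that query flow with off-the-shelf embedding models. A rough sketch (the model name is a public Hugging Face checkpoint; the documents and filenames are placeholders I made up):

    # Rough sketch: find the design doc most relevant to a question,
    # then hand its text to an LLM as context. Docs are placeholders.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")
    docs = {
        "lb_design.md": "We chose least-connections load balancing because ...",
        "cache_rfc.md": "The cache layer fronts the read replicas because ...",
    }
    doc_embs = model.encode(list(docs.values()), convert_to_tensor=True)

    query = "Why was the load balancer for X service designed in this way?"
    query_emb = model.encode(query, convert_to_tensor=True)
    hit = util.semantic_search(query_emb, doc_embs, top_k=1)[0][0]
    print(list(docs)[hit["corpus_id"]])  # -> lb_design.md

The hard part the LLM adds is synthesizing an actual answer from the retrieved text, rather than just pointing you at the right file.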
The advent of edge GPT-3 would be a huge change for robotics as well. It has been shown that augmenting robots with a language model for planning gives a leap in abilities. Maybe they could even enter any kitchen and make a sandwich (the "sudo make me a sandwich" task).
I am nowhere near the industry, so this may be a dumb question, but are the new analog in-memory compute modules[0] likely to help with this in reality? How far away is that reality?
I'm curious about your point 1, and I tend to disagree. The number of parameters in these large language models is increasing faster than Moore's law. Currently you need a server full of GPUs just to run inference on a PaLM model. How do you see the size shrinking so drastically? Hardware is improving on important factors like power consumption, but inference hardware needs to scale with the size of the models. Don't get me wrong, it's likely that PaLM itself could run on a 2032 phone, but the real advances will be in even more scaled-up models.
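Back-of-the-envelope on why I doubt it (counting fp16 weights only, ignoring activations and caches):

    params = 540e9               # PaLM's parameter count
    weight_bytes = params * 2    # 2 bytes per weight at fp16
    print(f"{weight_bytes / 1e12:.2f} TB just to hold the weights")   # ~1.08 TB
    print(f"~{weight_bytes / 8e9:.0f}x the RAM of an 8 GB phone")     # ~135x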
The future of AI will be in the data center for a long time to come. Maybe at some point the models will cease to scale up, and that point will be where the model would overfit even on the amount of data we can possibly give it, e.g. the entire internet. The PaLM authors allude to this in their conclusion.
There was just an article from DeepMind on HN about this topic the other day[1]; basically, IIRC, it argues that all of the LLMs are horrendously compute-inefficient, which means there's a ton of room to improve them. So those models will be optimized over time just as consumer hardware improves, until eventually one day the two trends converge. It's just a question of when that will happen.
Current LLMs are very compute inefficient. I also think retrieval transformers can bring a few orders of magnitude in efficiency improvements. Combined with architectural improvements I think we can get there.
I am trying to figure out how to integrate various deep learning networks together to make a coherent game. One of the big problems I have is having to use alternatives to DALL-E / GPT-3, because being contingent on their approval is a huge risk. I use Hugging Face instead, and I have many video cards. The current state of trying to do this comes down to the big problem of how to integrate the pieces while getting good quality. GPT-3 and other systems stop working well at around 500 words (tokens), and DALL-E is hard to use; it looks like it takes a lot of training on your own to make it work.
I don't think the marginal cost is truly zero until we can get classifiers or larger systems that can go from an image to a description in words, and until GPT-3 or another system works over at least a few pages. Right now you have to cherry-pick the output.
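For the token-limit problem specifically, the workaround I keep landing on is a sliding window. A crude sketch, using gpt2 as a stand-in for whatever local Hugging Face model you have (and character slicing as a rough proxy for real token-level truncation):

    # Sketch: generate past a model's context limit by sliding the window --
    # only the tail of the story is fed back in as the next prompt.
    from transformers import pipeline

    gen = pipeline("text-generation", model="gpt2")  # any local HF model
    story = "The airlock hissed open onto the procedurally generated plain."
    for _ in range(3):
        tail = story[-800:]  # crude stand-in for token-level truncation
        out = gen(tail, max_new_tokens=60, do_sample=True)[0]["generated_text"]
        story += out[len(tail):]  # keep only the newly generated text
    print(story)

The obvious failure mode is that anything that scrolls out of the window is forgotten, which is exactly why the output needs cherry-picking past a few hundred words.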
Not sure I get the social network or metaverse angle. But this is basically zero-cost content as it applies to entertainment, and maybe other industries as well. It's well beyond the metaverse if you consider the metaverse or gaming to be only one aspect of entertainment. And let's not be mistaken: we're talking about all entertainment, not just mass media. I'm looking forward to things like the replacement of restaurants and tourism and even chatting with friends, and this is well ahead of what we need from AGI.
I know you're being sarcastic, but as a kid I dreamed of an AI dungeon master that could run D&D-type games and draw beautiful pictures of the scenery the scenario would take place in.
This YouTube video, to me, shows the promise of things to come: AI-generated game worlds. Language models to generate plots and dialog, transformers and GANs to create illustrations. Imagine a game, a truly open-world sandbox, Grand Theft Auto meets AI Dungeon - every NPC is a "real" person with unlimited dialog options, every building you drive by one you could walk into and investigate, unlimited play space - and you could type general instructions and ideas into the plot generator ("add in a vampire romance and murder mystery angle") on the fly.
What you describe is actually all the more interesting aspects of the “holodeck” as introduced and explored (some would say too deeply) as a story concept on Star Trek: The Next Generation.
There are more than a few scenes where the intrepid crew members struggle with what we’d now recognize as prompts.
Feels like hyped keyword padding. From what I understand, this is the opposite of what the metaverse people want. They're looking to build stable and persistent things so that people will buy "real estate" and so on.
> Game developers pushed the limits on text, then images, then video, then 3D
> Social media drives content creation costs to zero first on text, then images, then video
> Machine learning models can now create text and images for zero marginal cost
Search was also text, then images, then video files... and I imagine the next step is searching video content. Before you say we already have Google video search or search on YT, I'm talking about indexing and categorizing something like the TikTok video feed and letting users access the content they want, not long-form videos with 3 minutes of intros and "please smash that like button and subscribe". Like searching for "what should I cook today?" and finding a 30-second video of someone cooking penne alla puttanesca while describing the recipe. Extra points if that search is a voice command.
Also FTA
> That phrase, “Facebook is compelling for the content it surfaces, regardless of who surfaces it”, is oh-so-close to describing TikTok; the error is that the latter is compelling for the content it surfaces, regardless of who creates it…
YT and Google video search are too focused on who creates the content. Who cares about channels and likes and subscriptions? "Just give me the shortest answer to my query and give it to me in video now".
> YT and Google video search are too focused on who creates the content
Correct - YouTube's feature roadmap is driven by monetization, which is in turn driven by monetizing attention through advertising. How often do you search directly for videos? I just looked through my recent YouTube search queries; 70% of the searches are for channels/content producers. Compounding that, most of my time in YouTube is spent consuming their recommendations rather than directly seeking content. So even during the rare times when I use YouTube search, it's usually not for specific video content.
The types of queries you're describing are better served by text. Perhaps a mix of both text and video content.
The attention could be split between a content-first and a personality-first ranking approach; the key problems become (1) signal quality, where fame/likes/subs can be a rough proxy, and (2) monetized attention, which can exist across both, though it's easier to end up with a meaningful monetization payout when aggregated by fame.
The easier way to do this (counterintuitively) is what TikTok has already done - don't try to parse the content of the video; instead, use user feedback to proactively serve the content that a user wants.
For us, a crowd of 'do-it-yourself' type-As, this seems like an incomplete solution, but the future is probably not one defined by user-generated search so much as by 'the computer just knows what I want.'
Yes, video search is about to become one of those things which Just Work™. The large Transformers combined with self-supervised learning have been burning up the video search/classification benchmarks, particularly from FB & Google. (Even just using CLIP on a few frames works reasonably well.)
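A sketch of the CLIP-on-frames trick, using the public Hugging Face checkpoint (the video filename is a placeholder, and sampling every 30th frame is deliberately naive):

    # Sketch: score sampled video frames against a text query with CLIP.
    import cv2, torch
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    cap, frames = cv2.VideoCapture("cooking.mp4"), []
    while len(frames) < 16:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        cap.set(cv2.CAP_PROP_POS_FRAMES, cap.get(cv2.CAP_PROP_POS_FRAMES) + 30)

    inputs = proc(text=["someone cooking penne alla puttanesca"],
                  images=frames, return_tensors="pt", padding=True)
    with torch.no_grad():
        scores = model(**inputs).logits_per_image.squeeze(1)
    print("best-matching frame index:", int(scores.argmax()))

Obviously real systems do far more (temporal models, ASR on the audio track, etc.), but even this toy version gets you surprisingly far.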
I think there were two original goals behind NFTs. One was to provide a way for people to continue to own content after finishing a game (whether the game dies or a person just naturally leaves it) and somehow transfer that content to another game. The other was to be able to play games directly on the blockchain that are comparable to the small mobile games people play now.
Sadly the problem with anything related to money is it attracts StockBros whose primary goal is to steal/take money from the uninformed.
DALL-E is very impressive but knowing how to Photoshop a dog onto a chair isn't a bottleneck of game design. It isn't design at all. Really good artists and designers aren't hired to make images that look like things. They make images that evoke specific responses. If you gifted me an unlimited DALL-E license I'd be no closer to creating good content than I was before.
The logic sounds good in theory, but in practice not much of this is grounded in today's reality. No one is reading AI-generated text for their news or entertainment. That will be an important first step, if it ever happens.
Counterpoint: I am fairly certain that a large portion of financial news is machine-generated today. Machine-generated blogspam is also extremely prevalent and edging into the “news” space.
> The program can dissect a financial report the moment it appears and spit out an immediate news story that includes the most pertinent facts and figures. And unlike business reporters, who find working on that kind of thing a snooze, it does so without complaint. Untiring and accurate, Cyborg helps Bloomberg in its race against Reuters, its main rival in the field of quick-twitch business financial journalism, as well as giving it a fighting chance against a more recent player in the information race, hedge funds, which use artificial intelligence to serve their clients fresh facts.
> “The financial markets are ahead of others in this,” said John Micklethwait, the editor in chief of Bloomberg. In addition to covering company earnings for Bloomberg, robot reporters have been prolific producers of articles on minor league baseball for The Associated Press, high school football for The Washington Post and earthquakes for The Los Angeles Times.
https://www.nytimes.com/2019/02/05/business/media/artificial...
I wonder if this is really using ML at all to generate the output, or if it's just filling out a template based on a predefined set of numbers and/or text properties. Baseball games, earthquakes, and financial data can all be ingested with a schema.
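For what it's worth, the no-ML version is almost trivially easy, which is why I'd bet on it too. A toy illustration (all the figures are invented):

    # Schema in, templated prose out -- no ML required.
    report = {"company": "ExampleCorp", "quarter": "Q1", "eps": 1.42,
              "eps_prior": 1.10, "revenue_b": 3.8}
    direction = "rose" if report["eps"] > report["eps_prior"] else "fell"
    print(f"{report['company']} reported {report['quarter']} earnings of "
          f"${report['eps']:.2f} per share; EPS {direction} from "
          f"${report['eps_prior']:.2f} a year earlier, on revenue of "
          f"${report['revenue_b']:.1f} billion.")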
If there is anyone here with access to it, please could you try using it to generate some tight pixel art of simple concepts and post the results? I am thinking - chair, coffee, boat, control panel, wedding ring, lamp, pit. I am curious whether it could be a fast way to design sprites. Imagine if it could be trained in the style of particular desktop systems.
I always thought the idea behind pixel art was to save on artist resources, but since the AI is doing it, I think indie devs would rather do something more advanced.
Imagine an AI that can spin you a story, and then take that story and your response to it to spin you into the next chapter. It could narrate you to - god knows where!
Now imagine if this was narrating or intermediating your reality for you... via your phone, say.
AI Dungeon is trained on previously played sessions, IIRC. It's pretty limited in generating novel outputs if players haven't previously fed them in (but I could be mistaken).
Computer-generated content has its place, as it already does in many procedurally generated games, as others have pointed out, but I do not think it will play a central role in the near future.
One reason for this is technical. AI (not AGI, but the systems that actually exist) can synthesize existing game data into new things, but you'll still be in an uncanny valley of recycled information and, in the worst case, in some bizarre feedback loop of old content that is permanently recycled and reconfigured in ways that players will notice. No Man's Sky is an example.
The other reason is social, and I think it's the stronger case. We already have many games where computers could replace humans; chess is the most obvious one. Yet computer chess competitions only generate marginal interest (among AI engineers), and players vastly prefer human interaction for the sole reason that it is human interaction. In the same vein, there is computer-generated music that is actually good enough to fool people, but there doesn't seem to be any market for synthetic artists.
Or take esports. DeepMind's AI systems are actually good enough to produce StarCraft tournaments played solely by machines. The games even look like high-level human gameplay. But there is zero interest in watching a bot tournament.
True, but couldn't the same be said about the real world, to some degree? Everything "new" is based on something existing, right? Even in sci-fi/fantasy movies that are completely wild, completely "out there", the things in them are conjured by us, which in some way, shape, or form means they're based on our imagination, which is rooted in the reality we exist within.
My point is that the feedback loop you speak of already exists; it's just that... there is SO much variation, content, and possibility. But the same thing could be true for AI-generated content in some sense. Feed it all our assets (sounds, music, movies, images) and then let it go bananas... :) - you'd have as much variation in the digital world as we do in our real world.
In the most literal sense it is true. We have experiences, DNA, senses, and so by definition we're a product of the past. But in the very concrete sense, when we make a piece of art we don't just do stuff in, stuff out. We express intent, coherent narrative, inner emotion, and so on. And that's really not what these ML systems do right now.
I think it's a little bit like crowdsourced art. There are these online experiments where you write a story but every user only places one word. It's grammatically correct, but the result is terrible, because fiction in a way comes from a kind of singular intent and mind. (An important point to note here: if 10k fully intelligent people make a democratic work of art with all their variety, it's almost certainly worse than one person doing it alone.)
I'm not saying we won't at some point have a real AI of human complexity, but if it were capable of good art it would look like a real mind, not something just vomiting out the average of all great novels.
> you'll still be in an uncanny valley of recycled information and in the worst case be in some bizarre feedback loop of old content that is permanently recycled and reconfigured
Does DALL-E 2 create, or does it simply make a pastiche of extant images/concepts? And if the latter, are we simply back to appropriation of content? That is, is whatever vast set of images it was fed (stock photography?) becoming uncredited fodder for this "recycling"?
I used GPT-3 to generate some porn scripts about various historical dictators and modern politicians that were pretty funny. Putin's secret romance with Kim Jong Un was especially saucy, until Osama bin Laden found out and told the EU. I'd be happy to find them and put them on a blog somewhere if there's an audience, ha!
Oh would I? How would you go about productising GPT-3 then? I'm a paying customer, I spent $0.06 earlier today of real money! :)
Edit: Oh, you mean porn?
> Threatening, stalking, defaming, defrauding, degrading, victimizing or intimidating anyone for any reason.
I think if I alter the parameters to make it a love story it's a bit more acceptable.
__________
It was a dark and stormy night, and Vladimir Putin was feeling lonely. He had just finished his usual routine of ruling Russia with an iron fist, and he was ready for some companionship. So he decided to call up his old friend Kim Jong Un.
"Hello, Kim Jong Un," Putin said. "How are you doing?"
"I'm doing well, Vladimir," Jong Un replied. "But I have to say, I'm feeling a bit lonely lately."
"I know the feeling," Putin said. "Why don't we cheer each other up with a love story?"
"That sounds great," Jong Un said. "I'm sure Osama Bin Laden would love to hear it."
So Putin and Jong Un began their love story. They recounted how they first met, when Jong Un was a young boy and Putin was his father's bodyguard. They spoke of the long nights they spent talking to each other, and the special bond they shared. They even shared a few intimate details about their relationship.
"I remember the first time we kissed," Putin said. "It was like fireworks going off in my head."
"I know exactly what you mean," Jong Un said. "I can still smell your cologne, Vladimir."
By the time they were finished, they were both laughing and crying. And Osama Bin Laden was on the phone to the European Union, begging them to intervene in the love story between Putin and Jong Un.
> Oh would I? How would you go about productising GPT-3 then?
Me? I wouldn't, given how draconian and irregular approval seems to be. Most of what you would do with GPT-3 could be done with GPT-NeoX or competitors, I think.
Also, I was referring to this part of the fiction sharing guidelines:
"Topics of the content do not violate OpenAI’s Terms of Use, e.g., are not related to political campaigns, adult content, spam, hateful content, content that incites violence, or other uses that may cause social harm." https://openai.com/api/policies/sharing-publication/