Structured world generation with LLMs
- Introduction: LLMs for creative tasks
- Why is this even hard?
- World Generation with LLMs
- What can you do with this?
- Conclusion
- More world generation examples
Introduction: LLMs for creative tasks
I haven’t seen a lot of discussion about how to use LLMs for fun and creative tasks. This is fair, because it’s actually kind of hard to do, particularly in a quantitative and publishable way. On the other hand, I know for a fact there’s a community of amateurs having a lot of fun with LLMs.
I’m going to show you a system I created for systematically generating fictional worlds. The goal was to design worlds that are diverse, modular, and extensible. They could be used for something like a D&D style roleplay or a MUD.
Why is this even hard?
The big issues with LLMs regarding creative tasks are, in no particular order:
- Lack of coherence, i.e. losing the plot or forgetting things (large models do much better than small ones)
- Repetitiveness. Models will recycle phrases verbatim. “Shivers down your spine” is a classic example. This gets worse over time because the presence of these phrases makes the model more likely to repeat them in the future.
- Reliance on cliches. Every town will be “Greenwood” and every dwarf will have “iron” or “beard” in their name.
- Limited context size. LLMs can only see a certain amount of input text at once, so they can’t know every little detail all the time.
The upshot is that all your text ends up looking kind of “samey,” and important details get lost or forgotten. I think most people have the same experience the first time they try roleplaying with an LLM. The first handful of messages blow you away, but then you start to notice repeated phrases, GPT keeps forgetting what items are in your backpack, and the magic fades. When you try to restart, you get similar village names and characters on the next go, and the one after that, and then you start to feel like you were tricked.
This isn’t a nefarious, intentional trickery. It’s just the nature of LLMs that they tend to look best at a glance and fall apart during extended use. This is largely due to the issues I outlined above.
People do all kinds of funky stuff to get around these issues. There are weird samplers like XTC to prevent cliches or DRY to prevent repetitive phrases. These have pros and cons. Additionally, some models are better than others at producing diverse outputs, and there is ongoing work to improve in this regard.
World Generation with LLMs
An interactive world graph for a fantasy world
Try clicking some of the nodes! You can drag them around too.
The leftmost node is a handwritten overview of the world. Every other node is a subregion generated by an LLM. Click on a node to see its description.
The world has a tree structure. As you move right along the tree, the nodes describe subregions of their parent region. For example, Elyria’s Crucible, Cryovale’s Grip, and Ignis’ Edge are all subregions of Aldreon. Curmongo’s Shack is a subregion of Dungwater.
The tree structure was inspired by the paper Generative Agents: Interactive Simulacra of Human Behavior. In that work, the world itself wasn’t AI-generated, but it had a tree structure that made it very convenient for the LLM to use during simulations.
How do you generate it?
The generation is all done locally using llama.cpp and a quantized version of the Nemotron 70B model. In my experience, Nemotron 70B is pretty good for creative tasks and often surprises me with neat twists. It does suffer from repetition issues and a bit of purple prose.
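For reference, talking to a locally running llama-server boils down to POSTing to its /completion endpoint. Here’s a rough sketch (the field names follow the llama.cpp server API; the helper name, URL, and default values are just illustrative):

```python
import json
import urllib.request

def build_payload(prompt, n_predict=512, temperature=0.8, stop=None):
    """Request body for llama-server's /completion endpoint.
    Field names follow the server API; defaults are illustrative."""
    payload = {"prompt": prompt, "n_predict": n_predict,
               "temperature": temperature}
    if stop:
        payload["stop"] = stop  # stop strings end generation early
    return payload

def complete(prompt, url="http://localhost:8080/completion", **kwargs):
    """POST a completion request to a locally running llama-server."""
    data = json.dumps(build_payload(prompt, **kwargs)).encode()
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]
```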
The prompt
The prompt for generating new child nodes looks something like this:
You are a creative writer generating content for an RPG game set in {root_name}. The user will ask you to write descriptions of locations. Your responses should be diverse, creative, interesting, and occasionally funny. Focus on the facts, contextualizing with common knowledge about the location.
The current location is {path_summary}.
{path_description}
Divide the {current_node_name} into {num_children} subregions. Each subregion should have a name, a list of descriptive tags, a one-line summary, and a long description. Format each entry like this:
1. Subregion Name
- Tags: Tags here
- Summary: Summary here
- Description: Description here
In the example above, {root_name} is Aldreon. {path_summary} is the path of nodes from the root to the current node; for example, it would be Aldreon->Elyria's Crucible->Luminari Wilderness if you were generating subregions of the Luminari Wilderness. {path_description} is the description of every node mentioned in the {path_summary}. Essentially, it’s increasingly-specific information about the current region.
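Filling in {path_summary} and {path_description} is just a walk from the current node back to the root. A minimal sketch (the class and function names here are illustrative, not the actual implementation):

```python
class Node:
    """Bare-bones world-tree node (illustrative stand-in)."""
    def __init__(self, name, description, parent=None):
        self.name = name
        self.description = description
        self.parent = parent

def path_to_root(node):
    """Nodes from the root down to `node`, in order."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain[::-1]

def path_summary(node):
    # e.g. "Aldreon->Elyria's Crucible->Luminari Wilderness"
    return "->".join(n.name for n in path_to_root(node))

def path_description(node):
    # Increasingly-specific descriptions, root first
    return "\n\n".join(f"{n.name}: {n.description}"
                       for n in path_to_root(node))
```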
The tags are meant to be a list of adjectives that loosely summarize the subregion. These are generated first to “seed” the model with creative inspiration. I will explain this further in the sampling section.
This prompt sets the model up to generate subregions, but there’s no guarantee the outputs will be very creative. I do some strange sampling to help ensure that the output will be diverse.
Sampling (this is the secret sauce)
I prefill the response up to the name,
**1.** **
Then generate the name. Then I continue prefilling to where the tags will be generated:
**1.** **The Shongor's Den**
* **Tags:**
And generate the tags.
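In code, the prefill-then-generate dance looks something like this, where `complete` stands in for whatever text-completion call you use (the signature and names are illustrative):

```python
def generate_entry(complete, prompt, index):
    """Prefill up to the name, generate it, then prefill up to the
    tags and generate those. `complete(prefix, stop=...)` is any
    function that continues `prefix` until a stop string."""
    prefix = prompt + f"**{index}.** **"
    name = complete(prefix, stop=["**"])    # model fills in just the name
    prefix += name + "**\n* **Tags:** "
    tags = complete(prefix, stop=["\n"])    # then just the tag list
    return name.strip(), tags.strip()
```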
The real trick, though, is that I do something funky when generating the name and the tags.
I sample one word at a time (using spaces as stop tokens). For each word, I sample the first token in the word uniformly from the $n$ most likely tokens, where $n$ is usually between 20 and 100. I also do some regex filtering to ensure that none of the tokens have funky characters like dollar signs.
This works because the words in the name and the tags don’t need much coherence. The whole point of them is to be creative and diverse. Then, the name and tags work as a creative anchor for the rest of the generation.
This uniform-first-token sampling keeps the model from constantly generating similar subregions.
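Here’s a sketch of the first-token step, assuming your backend can give you per-token probabilities for the next position (the filter regex and candidate count are illustrative):

```python
import random
import re

# Reject tokens with funky characters like dollar signs (illustrative filter)
CLEAN = re.compile(r"^[A-Za-z][A-Za-z'\-]*$")

def sample_word_start(token_probs, n=50, rng=random):
    """Pick a word's first token uniformly from the n most likely
    candidates. `token_probs` maps token strings to probabilities
    (how you get these is backend-dependent)."""
    ranked = sorted(token_probs, key=token_probs.get, reverse=True)[:n]
    candidates = [t for t in ranked if CLEAN.match(t)]
    return rng.choice(candidates)  # uniform: ignore the model's ranking
```

Subsequent tokens of the word are sampled normally until a space, so the word stays well-formed while its start is forced to vary.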
The brief summary and long description are generated using standard sampling methods. You can generally use higher temperature with the summary and lower temperature with the longer description.
Other implementation notes
I created a WorldNode class in Python that you can build these trees with. You can save and load them from YAML files. When generating subregions, it traverses the tree to build the prompt and autofill things like {path_summary} and {path_description}. At some point, I’ll probably put this code up on GitHub so other people can play with it.
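The serialization is straightforward because the tree maps directly onto nested mappings. A dependency-free sketch (my class targets YAML; this example uses JSON to stay stdlib-only, and all names are illustrative):

```python
import json

class WorldNode:
    """Illustrative stand-in for a serializable world-tree node."""
    def __init__(self, name, description, children=None):
        self.name = name
        self.description = description
        self.children = children or []

    def to_dict(self):
        return {"name": self.name, "description": self.description,
                "children": [c.to_dict() for c in self.children]}

    @classmethod
    def from_dict(cls, d):
        return cls(d["name"], d["description"],
                   [cls.from_dict(c) for c in d.get("children", [])])

    def save(self, path):
        with open(path, "w") as f:
            json.dump(self.to_dict(), f, indent=2)

    @classmethod
    def load(cls, path):
        with open(path) as f:
            return cls.from_dict(json.load(f))
```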
But all the secret sauce was already described, so you could implement this yourself if you wanted.
You can also ask for particular subregions. For example, I asked specifically for “Dungwater” and described it as “a shoddy town with very little to offer.”
Shortcomings
The major limitation is that every generated node can only see the nodes leading back to the root. The only exception to this is when sibling nodes are simultaneously generated. In this case, they’re all “seen” at the time of generation, which helps prevent duplicated nodes or contradictory information.
In general, though, disparate branches are liable to have contradictory information. It’s something to keep an eye on. There’s not really a good way to mitigate this, but it tends not to be a huge issue since most nodes eventually share some parent information, and distinct regions often handle distinct subject matter anyways.
What can you do with this?
I dunno.
This tool would be pretty handy for rapidly brainstorming areas of interest in a fictional world. If you need some distractions in a local forest, you can generate them on the fly.
I can see this being useful for D&D type creative tasks. What I’d like to do is add more structure so that the LLM can act as a Dungeon Master while you explore the world. You could generate regions on the fly while you explore. Ideally the world nodes could also hold additional locally-relevant information, like characters, quests, items, etc.
I’ve also used the sampling techniques to generate characters and had some success. Here’s an example:
Name: Diggory Sweetbark
Gender: Male
Race: Gnome
Alignment: True Neutral
Occupation: Gardener and Herbalist
About: Diggory Sweetbark is a gnome with a deep love for plants and nature. He was raised in the heart of the Verdant Expanse, a vast forest filled with magical flora and fauna. Diggory’s parents were renowned botanists, and they passed on their knowledge and passion to their son. He spent his early years learning about the various plants in the forest, their properties, and how to cultivate them. As he grew older, Diggory began to explore the forest beyond his family’s home, discovering rare and valuable herbs that he could use to create powerful potions and remedies. His reputation as a skilled herbalist and gardener spread throughout the Verdant Expanse, and he became known as the “Green Thumb of the Forest.” Despite his love for nature, Diggory is not opposed to using his knowledge to help others, even if it means venturing into the dangerous world outside the forest. He has a dry wit and a tendency to speak in riddles, often leaving his companions scratching their heads. His appearance is that of a typical gnome, with a round belly, bushy beard, and bright, twinkling eyes. He wears a wide-brimmed hat adorned with various plants and flowers, and his clothes are made from natural materials, blending in with the foliage around him. Diggory carries a staff made from the heartwood of an ancient tree, and he uses it not only for support but also as a tool to help him tend to his plants and defend himself if necessary.
Dialog: “Ah, you seek the rare Moonwhisper Bloom, do you? It grows only under the light of the full moon, in the shadow of the ancient willow. But be warned, the willow’s tears are poisonous, and the moon’s light can be deceiving. Tread carefully, young one, for the forest is a fickle lover.”
My hope is that I can combine several tools of this sort to get an increasingly complete and coherent experience.
Conclusion
LLMs can be alright at creative tasks. They do best with lots of structure and carefully-added randomness. If anyone out there wants the code, let me know and I’ll hustle to get it onto GitHub. Otherwise, I hope to have more updates as I figure out how to piece some of these tools together and make something more fun and interactive.
More world generation examples
A cyberpunk city
I used ChatGPT to generate the root node, then just generated a bunch of subregions at random. This is a cyberpunk-themed world named Vesper City. This was a one-shot; I didn’t even review the nodes before posting. If you really want to use these worlds for something real, I imagine you’d do your own touch-ups or regens as you go along.
A prison for people with superpowers
Another one-shot with a root node generated by ChatGPT. It has the same structure as the previous example because I was too lazy to randomize how I generate the nodes.