/ arbitrary

Sketch of Procedural Story Generating

Yes, I am going to try and update the blog bi-weekly, or fortnightly if you're British and used to the ambiguous use of bi-weekly.[1]

A few weeks ago on Twitter I posted about experimenting with some procedural story generation, something I find insanely interesting and would love to work on more (maybe in the summer). My computer science friends seemed, in general, interested in the project and said I should give it its own Twitter account. I liked the idea and have every intention of doing it once I jump the exam hurdles in April and hopefully bump my GPA out of its sorry state. However, I've decided to sketch out an outline of how it works right now, and maybe talk about where I want it to go when I can think of it as a real project and not something peripheral.[2] This post, as a result, is a lot of musing, and much less abstract than the previous entry. I'm not sure about the tone of the blog yet, so we may be seeing entries that range widely across the philosophy vs. computer science spectrum.

For the uninitiated, a rough definition of procedural content generation in computers can be outlined as follows:

Procedural content generation (PCG) is the programmatic generation of game content using a random or pseudo-random process that results in an unpredictable range of possible game play spaces.[3]

Usually, this is applied to dungeon and world building, though there have been a number of interesting attempts at story writing, and even more interesting examples of puzzle generation. Also usually (since apparently 'usually' is the word of the day), PCG is applied to games. I'm halfway certain it wouldn't be hugely difficult to apply it to other fields, but primarily it is used in games. Or to entertain myself. You know, whichever.

The approach I have taken is much more pseudo-random than true random. At the time of writing, instead of using a real database, it runs through a flat file containing something in the neighborhood of 150 sentences, each with a number rank assigned on a very vague scale which I've called 'intensity', since I've decided to ignore my training in philosophy and not be careful with my language.
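As a minimal sketch of what reading such a flat file could look like: the post doesn't describe the actual file layout, so the `intensity|sentence` format, the `#` comment lines, and the name `load_sentences` below are all assumptions made for illustration.

```python
import io

def load_sentences(lines):
    """Parse lines of the ASSUMED form '<intensity>|<sentence>'.

    Blank lines and lines starting with '#' are skipped.
    Returns a list of (intensity, sentence) tuples.
    """
    sentences = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        rank, _, text = line.partition("|")
        sentences.append((int(rank), text.strip()))
    return sentences

# A tiny stand-in for the real file of roughly 150 sentences (hypothetical data).
SAMPLE = """\
# intensity|sentence
2|She filed the appropriate paperwork.
9|He punched a walrus.
"""

sentences = load_sentences(io.StringIO(SAMPLE))
```

With a real file, the same function would accept an open file handle, for example `load_sentences(open("sentences.txt"))`, where the filename is again hypothetical.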

The reason for this is fairly simple: I'm only piecing together a basic idea. I'm going on the fact that stories share a bunch of general properties. They usually (again) have a beginning, middle, and end, with the beginning and end being less 'intense' than the middle, which is usually where the most climactic moment occurs. For the sake of my need to worship at the altar of arbitrary, I decided that endings are generally slightly more intense than beginnings. From there, the sentences are randomly picked and placed in order, but each one is weighted, which influences the probability of it being picked. As I tweeted, the current system is rudimentary, but it's meant to show how I plan on smashing English class and Math class into one beast.

And what a terrible beast it is.
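The weighted picking described above could be sketched roughly as follows. This is not the actual code behind the project: the shape of the intensity curve, the default values of `low`, `peak`, and `end`, the weighting formula, and the sample sentences are all assumptions chosen to show one way that "low start, intense middle, slightly higher ending" could drive the probabilities.

```python
import random

def target_intensity(position, length, low=2.0, peak=9.0, end=3.0):
    """Assumed story arc: climb from `low` to `peak` at the midpoint,
    then fall to `end`, which sits slightly above `low`."""
    if length == 1:
        return peak
    t = position / (length - 1)  # 0.0 at the first sentence, 1.0 at the last
    if t <= 0.5:
        return low + (peak - low) * (t / 0.5)
    return peak + (end - peak) * ((t - 0.5) / 0.5)

def generate_story(sentences, length, rng=None):
    """Pick `length` sentences without repeats; sentences whose intensity is
    close to the slot's target get a larger weight, but any can be picked."""
    rng = rng or random.Random()
    pool = list(sentences)
    story = []
    for position in range(min(length, len(pool))):
        target = target_intensity(position, length)
        weights = [1.0 / (1.0 + abs(rank - target)) for rank, _ in pool]
        index = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        story.append(pool.pop(index)[1])
    return story

# Hypothetical (intensity, sentence) pairs standing in for the flat file.
CORPUS = [
    (1, "The office was quiet."),
    (2, "She filed the appropriate paperwork."),
    (5, "Someone knocked on the door."),
    (9, "He punched a walrus."),
    (3, "Everyone went home."),
]

story = generate_story(CORPUS, 3, rng=random.Random(0))
print(" ".join(story))
```

Because every sentence keeps a non-zero weight, the output stays unpredictable; the arc only tilts the odds rather than fixing the order.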

As you can probably see, the idea is simple at its absolute pinnacle, but it lays a foundation for a few future ideas, which look like the following:

  • Move from pre-made sentences to an actual sentence generator. Even though this looks like a picky detail right now, I don't think I'll be able to get anywhere near the variety I want with a bunch of simple sentences that came out of the cesspool that is my brain. The problem is that the formula for creating a coherent sentence is not as simple, and it will probably need a cumulative intensity score, which is to say that particularly forceful words will be selected for middle-of-the-story climactic moments, and less dramatic ones for the beginning and the end. The catch is that the story will still have to make sense. Speaking of...
  • 'Intensity' is an arbitrary scale that really does nothing to help the generation of interesting content. Sometimes I feel bad for having implemented it. Still, I think there is some reasoning behind scaling sentences based on their impact. There is also a whole bunch of subjectivity in intensity, and yes, I know that fiction is subjective anyway, but I'd like to think that if you're talking about punching a walrus, this is more dramatic than filing the appropriate paperwork. The problem is borderline sentences -- like punching the appropriate paperwork. A way to smooth the sharpness of my personal opinion within the system would be to crowd-source[4] intensity ratings. Of course, a vague definition of intensity would make this difficult, if not impossible.
  • Keeping related ideas together. Currently the system demonstrates awesome levels of attention deficit disorder by skipping around between sentences that talk about different subjects with different degrees of sense. This would tie into the idea of creating things like characters and settings, which should stay consistent and not flicker back and forth like a weird psychedelic trip.
  • In general, more mystery and imagination: read this article for the 30th time and use some of its ideas, minus the mention of Prolog and backtracking, because that takes me back to a cold, cold place.
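The crowd-sourcing idea in the list above could look something like the sketch below. Nothing like this exists in the project yet, so the function name `aggregate_ratings`, the `disagreement_threshold` value, and the ratings themselves are invented for illustration. The idea is to average several people's ratings per sentence and flag the sentences where raters disagree most, since those are the likely borderline cases.

```python
from statistics import mean, pstdev

def aggregate_ratings(ratings, disagreement_threshold=2.0):
    """Average each sentence's ratings and mark it 'borderline' when the
    population standard deviation exceeds the (arbitrary) threshold."""
    summary = {}
    for sentence, scores in ratings.items():
        spread = pstdev(scores)
        summary[sentence] = {
            "intensity": mean(scores),
            "spread": spread,
            "borderline": spread > disagreement_threshold,
        }
    return summary

# Hypothetical ratings from four imaginary readers on a 1-10 scale.
ratings = {
    "He punched a walrus.": [9, 8, 9, 10],
    "She filed the appropriate paperwork.": [1, 2, 1, 2],
    "He punched the appropriate paperwork.": [2, 8, 5, 9],
}

summary = aggregate_ratings(ratings)
for sentence, info in summary.items():
    flag = " (borderline)" if info["borderline"] else ""
    print(f"{info['intensity']:.1f}  {sentence}{flag}")
```

This only softens one person's opinion; it does nothing about the vagueness of 'intensity' itself, which is the harder problem the list points out.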

Despite all this, I want to keep generation as (pseudo)random as possible. Many PCG story generation systems rely on a sort of story outline, or other firm structures, to keep the computer from getting too whimsical. I'd like to direct the flow toward coherence, but away from any real structure, and I'm hoping the process will be vaguely organic. There does need to be a structure to have a story, but rather than the traditional approach, which has a human setting out that structure, I want the system itself to build the skeleton and then flesh it out.

Anyways, there's the long version of what I had hoped would be a short introduction to the project. I am firmly opposed to brevity.


  1. This absolutely threw me off, and I spent most of last week figuring out if biweekly was twice a week or every other week. ↩︎

  2. Read: Distracting ↩︎

  3. http://pcg.wikidot.com/what-pcg-is ↩︎

  4. I feel like a total drone saying "crowd-source" in an un-ironic context here ↩︎