Procedural Content Generation through a Data-Driven Approach

Automating the design of procedural content

Hughes Yip Liren
Apr 20, 2022

Introduction

A data-driven program keeps its algorithms and logic in the base code. Instead of relying on hard-coded values, it runs a set of external data through those systems as input. This approach, known as data-driving, lets a program stay versatile and adapt to changes without rewriting the code itself.

During my UXG2176 (Advanced Scripting) class, we learned how data-driven games store in-game statistics such as ‘Gacha Rates’ and ‘Character Stats’. These values are often read from and written to text-based formats such as JSON and XML. Because the values live in these external files, developers can modify them easily to improve the user experience.
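As a minimal sketch of what this can look like, the Python snippet below parses character stats and gacha rates from JSON instead of hard-coding them. The field names, the values, and the `load_game_data` helper are all assumptions made up for illustration; they are not taken from any particular game or from the class material.

```python
import json

# Hypothetical data; in a real project this text would live in an external
# .json file that designers can edit without touching the code.
STATS_JSON = """
{
  "characters": {
    "knight": {"hp": 120, "attack": 15},
    "mage":   {"hp": 70,  "attack": 25}
  },
  "gacha_rates": {"common": 0.80, "rare": 0.17, "legendary": 0.03}
}
"""


def load_game_data(text: str) -> dict:
    """Parse the JSON text and check that the gacha rates sum to 1."""
    data = json.loads(text)
    total = sum(data["gacha_rates"].values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"gacha rates sum to {total}, expected 1.0")
    return data


game_data = load_game_data(STATS_JSON)
print(game_data["characters"]["mage"]["attack"])
```

Rebalancing the mage or changing a drop rate then only means editing the data, while the loading logic stays untouched.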

This report looks at how game designers can employ Procedural Content Generation through a data-driven approach.

Procedural Content Generation (PCG)

Procedural Content Generation (PCG) in games refers to the automatic creation of game content using algorithms and a pseudo-random process, resulting in a wide range of possible gameplay spaces. PCG is commonly found in roguelikes, where its benefits include dynamic content, greater replayability, and even savings in development time and money.

Data-Driven PCG in Video Games

Pokémon Mystery Dungeon: Rescue Team DX

Pokémon Mystery Dungeon: Rescue Team DX is a roguelike video game developed by Spike Chunsoft and published by The Pokémon Company in 2020 for the Nintendo Switch. As a mission-based game, it lets the player take on a variety of jobs within dungeons, such as rescuing Pokémon, delivering items, or escorting clients.

A dungeon room filled with items, traps, and hostile Pokémon

Taking after the classic game Rogue, the game procedurally generates the layout of each dungeon, along with the items and Pokémon found inside it. While the items and Pokémon in each dungeon are random, they are drawn from a list of data that dictates which of them can spawn and at what rate.
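One common way to implement this kind of data-driven spawning is a weighted table. The sketch below is only an illustration of that idea: the entry names and weights are made up, and it does not reflect the game's actual data or implementation.

```python
import random
from typing import Dict

# Hypothetical spawn table for one dungeon: entry name -> relative weight.
# In a data-driven game this table would be loaded from an external file.
SPAWN_TABLE: Dict[str, float] = {
    "common_enemy": 60,
    "uncommon_enemy": 30,
    "rare_enemy": 10,
}


def pick_spawn(table: Dict[str, float], rng: random.Random) -> str:
    """Pick one entry at random, proportionally to its weight."""
    names = list(table)
    weights = list(table.values())
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(42)  # fixed seed so the run is repeatable
spawns = [pick_spawn(SPAWN_TABLE, rng) for _ in range(1000)]
for name in SPAWN_TABLE:
    print(name, spawns.count(name))
```

Changing a weight in the table shifts how often that entry appears, without any change to `pick_spawn`.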

Unlike the original Rescue Team games, this remake was released on the Nintendo Switch, which allows the developers to deliver downloadable content over the internet in the form of software updates. With this approach, the spawn rates of new and existing Pokémon and items can be easily rebalanced, provided those values are data-driven.

Similarly, this approach allows the developers to patch any detrimental bugs that escaped testing during development. This was not possible for older games on devices like the Game Boy Advance, where all content was final once the game shipped. With today’s online distribution, data-driving lets designers keep making changes after release to create a better experience for the players.

Dynamic Difficulty Balancing

Data-driven systems offer several advantages in terms of game design. For instance, the difficulty of a game can be dynamically adjusted based on the performance of the player.

The zones of Player Skill in relation to Difficulty visualized

It is common for the difficulty of a game to increase as the player progresses through levels or over time. Once a difficulty level is selected, the parameters of this increase often do not stray from their predetermined course. While these difficulty levels are designed to fit their target audience, they may still lead to frustrating moments for experienced and inexperienced players alike, as not everyone follows the same learning curve. Thus, some games opt to dynamically balance the difficulty based on the performance of the player.

When the player clears a level, factors such as their remaining hit points and the time taken can be combined into a score that represents their overall performance. Based on the score, the difficulty of the next level can then be balanced to suit the player’s ability. Elements such as the frequency of enemies and the duration of gameplay can be changed while staying consistent with the rules of the game’s world. With dynamic balancing, games can keep their players interested throughout the experience by providing a good level of challenge.
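The scoring step described above could be sketched as follows. The way the two factors are normalized, the thresholds, and the step size of two enemies are all assumptions chosen for illustration; a real game would tune these values, ideally from data.

```python
from dataclasses import dataclass


@dataclass
class LevelResult:
    hp_remaining: int
    hp_max: int
    seconds_taken: float  # assumed to be greater than zero
    par_seconds: float    # hypothetical target clear time for the level


def performance_score(result: LevelResult) -> float:
    """Sum two factors, each in [0, 1], giving a score in [0, 2]."""
    hp_factor = result.hp_remaining / result.hp_max
    # Clearing at or under par gives the full 1.0; slower clears give less.
    time_factor = min(1.0, result.par_seconds / result.seconds_taken)
    return hp_factor + time_factor


def next_enemy_count(current: int, score: float) -> int:
    """Raise or lower the enemy count for the next level (assumed thresholds)."""
    if score >= 1.5:
        return current + 2
    if score <= 0.5:
        return max(1, current - 2)
    return current


strong_run = LevelResult(hp_remaining=90, hp_max=100, seconds_taken=60, par_seconds=90)
weak_run = LevelResult(hp_remaining=10, hp_max=100, seconds_taken=300, par_seconds=90)
print(next_enemy_count(10, performance_score(strong_run)))
print(next_enemy_count(10, performance_score(weak_run)))
```

Keeping the thresholds in data rather than in code would let designers retune the difficulty curve without a code change.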

Procedural Content Design

Procedural generation is used to automatically create large amounts of content in video games. However, it is arguably not the same thing as procedural content in the sense used within PCG.

In the context of roguelike games, procedurally generating a dungeon may automatically create the playing field, but it does not necessarily produce any form of content design. In PCG, procedural content refers to the algorithmic creation of game content with limited or indirect user input.

So, what is considered procedural content, and how does it tie into data-driving? For example, instead of using predefined parameters, the size and complexity of the aforementioned dungeon can be driven by the game’s progression, while the rules of generation remain purely algorithms and logic. Adding dynamic difficulty adjustment into the mix, the speed of the game’s progression can be determined entirely by how well the player performs in an increasingly complex dungeon. This means everyone starts progressing through the game at the same speed, but the pace picks up for players who perform well, helping them progress at a significantly faster rate.
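To make this concrete, here is a minimal sketch in which the dungeon size is computed from the current floor and a per-player pace value. The linear formula, the default numbers, and the `update_pace` rule are assumptions for illustration only.

```python
def dungeon_room_count(floor: int, pace: float = 1.0,
                       base_rooms: int = 5, rooms_per_floor: int = 2) -> int:
    """Room count grows with the floor number, scaled by the player's pace."""
    return base_rooms + round(rooms_per_floor * floor * pace)


def update_pace(pace: float, performed_well: bool, step: float = 0.1) -> float:
    """Every player starts at pace 1.0; good performance speeds things up."""
    return pace + step if performed_well else pace


pace = 1.0
for floor in range(1, 4):
    print(floor, dungeon_room_count(floor, pace))
    pace = update_pace(pace, performed_well=True)
```

The generation rules stay fixed, while the inputs that drive them come from the player's progress.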

From there, the connection between PCG and data-driven progression can redefine how content should be gated. For instance, the rate of spawning a chest room can be inversely proportional to the size of the dungeon, while the rarity of the loot increases as the dungeon grows. A dungeon of 10 rooms would then have a 10% chance of spawning a common chest, whereas a labyrinth-sized dungeon of 100 rooms would have a mere 1% chance of spawning a chest filled with rare loot. In gameplay terms, new players are more likely to find a chest with basic items that help them through the early game, while an expert will have to grind their way through a monstrous labyrinth to get their hands on that legendary S tier weapon.
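The numbers in this example follow from a chance of 1/N for a dungeon of N rooms. The sketch below encodes that rule; the intermediate "uncommon" tier and its 50-room threshold are assumptions added for illustration, since the text only gives the 10-room and 100-room cases.

```python
def chest_chance(room_count: int) -> float:
    """Chest spawn chance, inversely proportional to the dungeon size."""
    if room_count < 1:
        raise ValueError("a dungeon needs at least one room")
    return 1.0 / room_count


def loot_tier(room_count: int) -> str:
    """Loot rarity rises with dungeon size (thresholds are assumed)."""
    if room_count >= 100:
        return "rare"
    if room_count >= 50:
        return "uncommon"
    return "common"


for rooms in (10, 100):
    print(rooms, f"{chest_chance(rooms):.0%}", loot_tier(rooms))
```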

Summary

In conclusion, this report hopes to have provided a glimpse of how PCG can benefit from a data-driven approach. To reiterate, some benefits of data-driving in games are:

  • Easily access and modify in-game stats on the developer’s end.
  • Dynamically balance the difficulty of a game to constantly provide a good level of challenge for all players.
  • Adjust the content available to players based on their progress.

However, a data-driven approach may not always be the answer. In practice, time and resources are limited, and game developers may not have the luxury of running tests and balancing iterations. In those cases, relying on heuristics can help to solve problems and speed up the decision-making process. Ultimately, the decision to implement data-driven systems boils down to how complex the design of a game has to be.
