We need DataStore data migration support!

I’ve been developing my game Shard Seekers full time since late 2015. One thing that’s been worrying me more and more over the years is the fact that I need to support my game’s current binary format forever. For my game to succeed long term, I need to be able to experiment with new systems and features without committing to them forever; right now I’m too hesitant to add new types of content to my game because of this.

Some way to iterate over millions of DataStore keys and process them would be a relief. I’m ready to leave a computer running for months / years and pay for the overheads if I need to. Some way to download a list of all keys in a DataStore, or keep track of a meaningful position when iterating over them would be needed for this to work for mid-size games. Adopt Me sized games would probably need a distributed solution that utilizes game servers.

It might be possible to achieve something like this by storing player user ids with save timestamps in an OrderedDataStore and iterating starting from the oldest saves, but that ship has sailed.
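For a game that did record save timestamps from the start, the oldest-first walk could look roughly like the sketch below. It assumes `orderedStore` behaves like a Roblox OrderedDataStore (as returned by `DataStoreService:GetOrderedDataStore`) whose keys are user ids and whose values are save timestamps; `migrateOldestFirst`, `cutoffTime`, and the `migrateUser` callback are hypothetical names for illustration only.

```lua
-- Sketch only: walks an OrderedDataStore-like object from the oldest save upward.
-- `migrateUser(userId, timestamp)` is a hypothetical callback that returns true
-- when it migrated that player's data.
local function migrateOldestFirst(orderedStore, cutoffTime, migrateUser)
	local migrated = 0
	-- Ascending order, 100 entries per page, only saves at or before cutoffTime.
	local pages = orderedStore:GetSortedAsync(true, 100, nil, cutoffTime)
	while true do
		for _, entry in ipairs(pages:GetCurrentPage()) do
			-- entry.key is the user id, entry.value is the save timestamp.
			if migrateUser(entry.key, entry.value) then
				migrated = migrated + 1
			end
		end
		if pages.IsFinished then
			break
		end
		pages:AdvanceToNextPageAsync()
	end
	return migrated
end
```

One thing to watch for: if `migrateUser` also rewrites the timestamp in the same ordered store, entries can shift between pages while iterating, so it is probably safer to record progress elsewhere and update timestamps afterwards.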

My game’s save system is set up so I can load a save regardless of whether a player is present, and migrating a player’s data would look like this:

  1. Load the data
  2. Deserialize the data from a string to the game representation
  3. Serialize the game representation to a new “migrated” string
  4. Deserialize this new migrated string to a second game representation (to test stability)
  5. Verify that these strings are the same (to test stability)
  6. Save the data
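The steps above can be sketched as a single function. This is a minimal illustration, not the game's actual save code: `storage.load`, `storage.save`, `codec.deserialize`, and `codec.serialize` are hypothetical stand-ins, and step 5 is assumed to mean re-serializing the second representation and comparing it with the migrated string.

```lua
-- Sketch only: `storage` and `codec` are hypothetical tables supplied by the caller.
local function migrateSave(key, storage, codec)
	local oldString = storage.load(key)          -- 1. Load the data.
	local first = codec.deserialize(oldString)   -- 2. Old string -> game representation.
	local migrated = codec.serialize(first)      -- 3. Game representation -> migrated string.
	local second = codec.deserialize(migrated)   -- 4. Migrated string -> second representation.
	local roundTrip = codec.serialize(second)    -- 5. (Assumed) re-serialize so strings can be compared.
	if roundTrip ~= migrated then
		-- Unstable round trip: leave the stored data untouched.
		return false, "unstable round trip for key " .. tostring(key)
	end
	storage.save(key, migrated)                  -- 6. Save the data.
	return true
end
```

Nothing is written unless the round trip is stable, so a failed check leaves the old save in place for inspection.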

I know I can migrate data safely. What’s unsafe is supporting old save formats without doing unit tests on old saves.


I’m okay with having if saveVersion < x in my code to support changes to the save format; this is fine and makes sense for short term changes. What I’m not okay with is the fact that my codebase will irreversibly grow long-term and become more difficult to maintain every time I increment the game’s save version. I could skip everything except currency purchases and start fresh, but I would feel horrible deleting everything players have created, especially when it could just be possible to migrate that data to a new format.

From a maintainability perspective, my problem is that any mistakes when modifying and maintaining legacy save code would be catastrophic for players who come back to the game after a long time, and save stability is extremely important. I have a huge collection of old saves that I run stability tests on before I publish the game for this reason; these test saves bloat my save file, and this testing process will only get slower as I make more changes and need to add more samples to it.

Here are a few examples of old code that I still need to maintain:

image

image

It’s easy to make design mistakes early in development, and right now DataStores are simply unforgiving.


I’m in the same boat - I’ve made a “chain” of ModuleScripts to continuously update people’s data based on wherever they left off. It’s absolutely a pain, and it’s easy to screw up and forget to make another one of these ModuleScripts for EVERY version of my data (every update that stores a new value needs to become a new version of data).

module -1 will convert the data from the original version (before I had realized I would need to track version numbers :roll_eyes: ) to version zero, then zero to one, then one to two, etc. Whenever a player joins, I don’t know when the last time they joined was, so I have to start off their upgrade system from wherever the last iteration of their data was. It quickly becomes a mess and if, God forbid, I accidentally mess up on that code that updates the data version, I would be totally screwed.

On a separate note, I do think there should be some sort of immediate recovery option if you migrate your datastores and it turns out your code didn’t do what you expected. Yes, your code should be written carefully enough that you never hit that problem, but given the damage a feature like this could cause if you got a little sleepy or didn’t fully know what you were doing, there should be a way to back up your datastores just in case.


Right now I have 12 versions starting at 1. I’ve been trying to design formats to depend less on low level serialization code, so versions can be easy to support. Here’s how character customization properties are stored:


I could potentially only save TailColor for mermaids, WingColor for sky elves, BeardStyle for male characters, etc., but this was too complex to support if I wanted to change something down the line. This simplistic design was only achievable because I first learned from my mistakes.

A while back I added a new inventory/equipment format that does really efficient delta encoding that saves a lot of space for players that have tons of characters. It’s really compact and fast to serialize, but I didn’t fully consider how much more complex this format will be to support if I decide the game needs the inventory/equipment system set up differently in the future.

Most Roblox games use a single fixed system, add tons of content using that system, then move on to the next project. I don’t want to throw my project away just because it’s too difficult to maintain. I’ve been working on the same project for nearly 5 years because it continues to do well and I want to create a truly massive game that delivers great experiences. Ideally I would just set up my formats perfectly in the first place, but the need for data migration increases as a project evolves and its priorities change.

I need to be able to explore new content vectors with the assurance that it will be possible to migrate data someday. A single simple format is only possible once I’ve learned from my mistakes and know exactly what the game needs.


@bigcrazycarboy is on the right track. Do not try to interleave version migrations. A single migration must transform a state only into the immediate next version. These migrations can then be run in a chain to update any previous version to the current.

Validate the correctness of the format during migration. If something goes wrong, don’t leave yourself blind. Halt the migration, lock the DataStore from being overwritten, and report the problem somewhere that will let you inspect it manually.

Here’s a dumb example:

local Migrate = {}

-- Migrate version 1 to version 2.
Migrate[1] = function(state)
	-- Rename foo to bar.
	state.bar = state.foo
	state.foo = nil

	-- Finalize migration.
	state.version = 2

	return state
end

local Validate = {}

-- Validate version 1.
Validate[1] = function(state)
	-- Field 'foo' is a number between 0 and 10.

	if type(state.foo) ~= "number" then
		return false
	end
	if state.foo < 0 or state.foo > 10 then
		return false
	end
	return true
end

-- Validate version 2.
Validate[2] = function(state)
	if type(state.bar) ~= "number" then
		return false
	end
	if state.bar < 0 or state.bar > 10 then
		return false
	end
	return true
end

-- Perform validation and migration.
local state = LoadData(player)
local ok = true
while ok do
	local validate = Validate[state.version]
	if not validate then
		-- Something wrong with version number.
		ok = false
		break
	end
	if not validate(state) then
		-- Something wrong with state.
		ok = false
		break
	end
	local migrate = Migrate[state.version]
	if not migrate then
		break
	end
	state = migrate(state)
end

if not ok then
	-- Tell the player.
	-- Lock player's DataStore to prevent overwriting.
	-- Tell analytics so the corrupted state can be inspected.
	-- Report each migration step.
end
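One possible way to make the failure branch easier to fill in is to wrap the loop in a function that works on a copy and records each step. The sketch below is only an illustration under a few assumptions: `runMigrations` and `deepCopy` are hypothetical names, the state is assumed to be a plain table without cycles, and `Migrate` and `Validate` are tables shaped like the ones in the example above.

```lua
-- Recursively copies plain tables (assumes no cycles and no metatables).
local function deepCopy(value)
	if type(value) ~= "table" then
		return value
	end
	local copy = {}
	for k, v in pairs(value) do
		copy[k] = deepCopy(v)
	end
	return copy
end

-- Returns ok, state, log. On failure, `state` is the untouched original so the
-- corrupted data can still be inspected; `log` lists each step taken.
local function runMigrations(original, Migrate, Validate)
	local state = deepCopy(original)
	local log = {}
	while true do
		local version = state.version
		local validate = Validate[version]
		if not validate then
			log[#log + 1] = "unknown version " .. tostring(version)
			return false, original, log
		end
		if not validate(state) then
			log[#log + 1] = "invalid state at version " .. tostring(version)
			return false, original, log
		end
		local migrate = Migrate[version]
		if not migrate then
			return true, state, log -- Already at the newest version.
		end
		state = migrate(state)
		log[#log + 1] = "migrated " .. tostring(version) .. " -> " .. tostring(state.version)
	end
end
```

Because the migrations run on a copy, a failure halfway through the chain never leaves the loaded state half-migrated, and the log gives analytics something concrete to report.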

Specify your formats. A specification gives you something to look to when you need to refresh your memory. It also provides a source of truth when something is incorrect.

The absolute first thing I do when designing a format is to write out a document that describes its structure. Over the years, I’ve developed a small language to make this process easier:

Changes to the format should be specified as well. Reserve enough bytes for your version number that you can increment it every time you make a change, even if it isn’t breaking.
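As a rough illustration of reserving room for the version number, a binary format could start with a fixed-width header. The two-byte, big-endian width here is an assumption chosen for the example (it allows 65,536 versions), and all function names are hypothetical. Depending on where the string is stored, the raw bytes may also need a text-safe encoding step, which this sketch leaves out.

```lua
local VERSION_BYTES = 2 -- Assumed header width: versions 0..65535.

-- Encodes the version as two bytes, most significant byte first.
local function encodeVersion(version)
	assert(version >= 0 and version < 65536 and version % 1 == 0, "version out of range")
	local high = math.floor(version / 256)
	local low = version % 256
	return string.char(high, low)
end

-- Reads the version back from the first two bytes of the data.
local function decodeVersion(data)
	local high, low = string.byte(data, 1, VERSION_BYTES)
	assert(high and low, "data too short to contain a version")
	return high * 256 + low
end

-- Prefixes a payload with its version header.
local function withVersion(version, payload)
	return encodeVersion(version) .. payload
end

-- Splits data into its version number and the remaining payload.
local function splitVersion(data)
	return decodeVersion(data), string.sub(data, VERSION_BYTES + 1)
end
```

With the version always at a fixed position, a loader can pick the right deserializer before touching the rest of the payload.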
