DataStore system: improvements and vulnerabilities

Hi All,

I’ll keep this brief. I’m looking for feedback on my saving system. I’ve omitted many parts of the code to keep this post succinct; what I’m specifically after is how the overall architecture can be improved to mitigate the risk of data loss. Any feedback at all would be much appreciated! I’ve also commented the script to make the intention behind certain parts clear.

--[[ SERVICES ]]--

local Players = game:GetService("Players")
local DataStoreService = game:GetService("DataStoreService")
local RunService = game:GetService("RunService")

--[[ VARIABLES ]]--

-- Datastore
local DataStore = DataStoreService:GetDataStore("MyDatabase")

-- Players that have been loaded in
local loadedPlayers = {} -- This table is used to constrain saving for only fully loaded players
local savingPlayers = {} -- This table is to check whether a player is already being saved

-- Overall player save variables
local plrSaves = {} -- Stores result from GetAsync()

--[[ FUNCTIONS ]]--

--[[ Saving Functions ]]--

-- General Save
local function Save(plr)
	-- Checks if the player is loaded before saving
	if not loadedPlayers[plr.UserId] or not plrSaves[plr.UserId] then return end
	
	-- Checks if the player data is already being saved
	if savingPlayers[plr.UserId] then return end
	
	-- Indicates that the player is currently saving // constrains further calls 
	savingPlayers[plr.UserId] = true
	
	--[[
		OMITTED SECTION 1: 
		   * This is where I save each of the player statistics, inventory, and placed items
		   * If anything returns nil, or false, I exit without saving to prevent data being overwritten
	]]--
	
	-- Pseudocode, but basic logic of the current system
	if anythingFromAboveReturnedNil or anythingFromAboveFailed then
		savingPlayers[plr.UserId] = nil -- clear the flag so later saves aren't blocked forever
		return
	end
	
	local key = "plr-"..plr.UserId
	local success, err = pcall(function()
		DataStore:UpdateAsync(key, function(oldData)
			
			-- Creates default data if the player is new
			if not oldData then
				oldData = {["Current"] = nil, ["Save1"] = {}, ["Save2"] = {}, ["Save3"] = {}}
			end
			
			local newData = oldData
			
			--[[
				OMITTED SECTION 2: 
		   			* This is where I save all the tables, variables, etc. generated from OMITTED SECTION 1
		   			* I save everything into the table newData and return this table
			]]--
			
			-- Updates local copy of saves
			plrSaves[plr.UserId] = newData

			return newData
		end)
	end)
	
	if not success then warn("Failed to overwrite data: " ..tostring(err)) end
	
	-- Resets saving bit to allow further calls
	savingPlayers[plr.UserId] = nil
end

-- Saves all players // simply calls above function
local function SaveLoadedPlayers()
	for _, player in pairs(Players:GetPlayers()) do
		Save(player)
	end
end

--[[ Loading Functions ]]--

-- Loading function
local function Load(plr)
	local key = "plr-"..plr.UserId
	
	local succ, err = false, ""
	local savedData = nil
	local count = 0

	-- Repeat fire for GetAsync() // 5 repeated calls in case of failure
	repeat
		succ, err = pcall(function()
			savedData = DataStore:GetAsync(key)
		end)
		if not savedData then wait(6) end
		count = count + 1
	until succ or count == 5

	-- Failed to read data
	if not succ then
		warn("Failed to read data: " ..tostring(err))
		return
	end

	plrSaves[plr.UserId] = savedData or {["Current"] = nil, ["Save1"] = {}, ["Save2"] = {}, ["Save3"] = {}}
	loadedPlayers[plr.UserId] = true -- mark the player as fully loaded so Save() will run for them

	return savedData
end


--[[ EVENT CONNECTIONS ]]--

-- Loading player data on player join
Players.PlayerAdded:Connect(Load)

-- Saving player data and resetting tycoon on player leave
Players.PlayerRemoving:Connect(function(plr)
	-- Checks if the player is already being saved
	if not savingPlayers[plr.UserId] then
		Save(plr)
	else
		repeat wait(2) until not savingPlayers[plr.UserId] -- waits until save is complete 
	end
	
	--[[
		OMITTED SECTION 3
			* Here I delete everything the player has placed since the player should be 
			  guaranteed to have been saved from above block.
			* Deleting items while the player saves can lead to corruption, hence the 
			  repeat wait(2) used above.
	]]--
	
	-- Clearing memory from tables
	plrSaves[plr.UserId] = nil
	loadedPlayers[plr.UserId] = nil
end)

-- Saving all player data on server shutdown ~ no need to clear memory
game:BindToClose(function()
	
	-- Multithreaded saving
	for _, player in pairs(Players:GetPlayers()) do
		-- Already saving player // no need to resave
		if savingPlayers[player.UserId] then continue end

		-- Saving new player
		coroutine.wrap(function()
			Save(player)
		end)()
	end
	
	if RunService:IsStudio() then 
		wait(5)	-- small enough wait to save my own data when running on studio
	else
		wait(30) -- waits maximum allowed time  
	end
end)

--[[ INIT ]]--

-- Autosave player data
do
	local autosaveTime = 480
	while wait(autosaveTime) do
		SaveLoadedPlayers()
	end
end

This looks pretty good. One of the better datastore scripts I’ve seen. Just a few tips tho:

  • Extraneous memory allocation in your :UpdateAsync method:
local newData = oldData

There is no point in assigning a new variable to oldData. It doesn’t copy the table; the new variable just holds another reference to the same table. If you were attempting to deep copy the table, create a recursive function like so:

local function deepCopy(original)
	local copy = {}
	for k, v in pairs(original) do
		if type(v) == "table" then
			v = deepCopy(v)
		end
		copy[k] = v
	end
	return copy
end

This will actually copy the table rather than hold a reference to oldData. So your revamped callback would look like:

	local key = "plr-"..UserId
	local success, err = pcall(function()
		DataStore:UpdateAsync(key, function(oldData)
			
			-- Creates default data if the player is new
            oldData = oldData or {["Current"] = nil, ["Save1"] = {}, ["Save2"] = {}, ["Save3"] = {}}
			
			local newData = deepCopy(oldData)
			
			--[[
				OMITTED SECTION 2: 
		   			* This is where I save all the tables, variables, etc. generated from OMITTED SECTION 1
		   			* I save everything into the table newData and return this table
			]]--
			
			-- Updates local copy of saves
			plrSaves[UserId] = newData

			return newData
		end)
	end)

^ There is actually no point to deep-copying the table lol (in your scenario)

Just do:

	local key = "plr-"..UserId
	local success, err = pcall(function()
		DataStore:UpdateAsync(key, function(oldData)
			
			-- Creates default data if the player is new
            oldData = oldData or {["Current"] = nil, ["Save1"] = {}, ["Save2"] = {}, ["Save3"] = {}}
			
			--[[
				OMITTED SECTION 2: 
		   			* This is where I save all the tables, variables, etc. generated from OMITTED SECTION 1
		   			* I save everything into the table newData and return this table
			]]--
			
			-- Updates local copy of saves
			plrSaves[UserId] = oldData

			return oldData
		end)
	end)
  • There seem to be a few inconsistencies in your code:
  1. Short-circuit evaluation versus If statements:
-- In your Save function
if not oldData then
   oldData = {["Current"] = nil, ["Save1"] = {}, ["Save2"] = {}, ["Save3"] = {}}
end
-- In your Load function
plrSaves[plr.UserId] = savedData or {["Current"] = nil, ["Save1"] = {}, ["Save2"] = {}, ["Save3"] = {}}
  2. The names of your variables:
-- In your Save function
local success, err
-- In your Load function 
local succ, err

It’s always good to be consistent throughout your code for readability purposes.

  • Unnecessary indexing:
    In your Save function, the only property of the player argument you are using is the UserId. In that case you can just pass in the UserId when calling the function.
local function SaveLoadedPlayers()
	for _, player in ipairs(Players:GetPlayers()) do
		Save(player.UserId)
	end
end
  • Nested function calls:
    In your coroutine.wrap call you are wrapping an anonymous function whose only job is to call another function, hence the wording “nested”. Instead you can simply do:
coroutine.wrap(Save)(player.UserId)
  • Usage of ipairs vs pairs
    The :GetPlayers() method returns an array, and therefore it’s considered better practice to use ipairs over pairs in this scenario. Pairs should only be used on dictionaries. Autterfly elaborates more here:
    Use ipairs for arrays lol

Aside from these quick fixes, I do admire the architecture of your code tho.
Happy Coding!

oh… I forgot to read this :skull:

Thank you very much! I’m mostly looking for advice on how to improve the architecture itself, but I’m always open to hearing criticism and improving my scripting ability.

To address a few of the points you’ve made:

  • It’s true that the assignment of newData and oldData is a bit redundant. Personally, I’ve strayed away from modifying function parameters, but in this case, since the table itself isn’t being copied and only a reference is actually passed into the function, I think it’s better to use oldData directly.
  • The inconsistencies are a fair criticism and something I honestly didn’t even notice. I have a very spontaneous approach to scripting, so I use whatever comes to mind at the moment. I will work on improving my scripts to be more consistent in the future however!
  • Naming is also something I do spontaneously, haha. I could fix up the names of my pcall return variables so they’re more consistent.
  • That’s a good point about unnecessary indexing. The function signatures are carried over from my old implementation of the system, where I used the player for more than just its UserId. I do think directly passing the UserId is a better approach now that I’ve gotten around to reviewing my code.
  • I’ll rewrite my coroutine to that format. I’m used to always creating anonymous functions for coroutines from scratch, but this is something I should definitely get in the habit of.
  • Is ipairs necessary when I’m throwing away the index values? I haven’t read enough about the difference between the two, but I will look into that!

I apologize. I didn’t even bother reading what you had to say :joy:

Yeah, do not deep copy the original table lol. Idk why I even suggested it, I just thought that was what u were going for.

ipairs returns the exact same values as pairs (the index and the value), and on an array pairs works the same way as ipairs. There are a few differences, though: ipairs is negligibly faster than pairs when traversing an array.

Knowing whether to use ipairs or pairs also gives more context to your code, showing whether you are traversing an array or a dictionary. And it’s no harm to add the extra “i” before pairs lol :wink:
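
To answer the earlier question about throwing the index values away: on a plain array they behave the same, so it really comes down to intent and a tiny bit of speed. A quick sketch of where they actually diverge (standard Lua behavior, nothing specific to your script):

local list = {"a", "b", nil, "d"}

-- ipairs walks 1, 2, ... and stops at the first missing index,
-- so it only visits "a" and "b" here
for i, v in ipairs(list) do
	print("ipairs", i, v)
end

-- pairs visits every non-nil entry (including index 4),
-- but with no guaranteed order
for k, v in pairs(list) do
	print("pairs", k, v)
end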

GetAsync doesn’t have a cooldown of 6 seconds, unlike SetAsync. Also, you should only retry if the pcall wasn’t successful, not when there was no saved data.

You are creating an unnecessary scope when autosaving.

Again, you should retry if the pcall failed, but this time yield the thread for 6 seconds, since you would be writing to the same key again.
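
Something like this rough sketch for the read side (the helper name, attempt limit, and delays are placeholders, not anything from your script):

local MAX_ATTEMPTS = 5 -- hypothetical limit, adjust to taste

local function tryGetAsync(dataStore, key)
	for attempt = 1, MAX_ATTEMPTS do
		local success, result = pcall(function()
			return dataStore:GetAsync(key)
		end)

		if success then
			-- result can legitimately be nil for a brand-new player;
			-- that is not a failure, so there is nothing to retry
			return true, result
		end

		warn(string.format("GetAsync attempt %d failed: %s", attempt, tostring(result)))
		if attempt < MAX_ATTEMPTS then
			wait(1) -- short pause between read retries; the 6-second spacing only matters when re-writing the same key
		end
	end

	return false, nil
end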

You should also be kicking the player when their data doesn’t load (i.e. getting it errors). It’s good that you don’t save if they didn’t load, but kicking them warns them of a possible datastore problem.
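
For example, the failure branch of your Load function could look something like this (the kick message is just an example):

	-- Failed to read data after all retries: don't let them play with nothing loaded
	if not succ then
		warn("Failed to read data: " ..tostring(err))
		plr:Kick("Your data could not be loaded. Please rejoin in a few minutes.")
		return
	end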

Using ProfileService would kind of cover everything here. I’m personally working on my own flavor of “ProfileService” with backups. ProfileService needs to autosave to session-lock its data, so it helps you in most cases here.
ProfileService, I believe, also spreads data out when it’s bigger than it’s supposed to be? I might be wrong here, but yeah.
I recommend that you run a coroutine for every player that autosaves their data every 60 seconds; it doesn’t need to be 480 seconds for everybody, and saving everyone at once would still throttle. A coroutine per individual player is not that demanding, and it’s better for player data too.
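
A rough sketch of that idea, assuming your existing Load and Save functions and that loadedPlayers gets set to true once Load finishes; it would replace the single autosave loop at the bottom of your script:

Players.PlayerAdded:Connect(function(plr)
	Load(plr)

	-- One autosave loop per player; it stops on its own once the player leaves
	coroutine.wrap(function()
		while plr.Parent and loadedPlayers[plr.UserId] do
			wait(60)
			if plr.Parent then
				Save(plr)
			end
		end
	end)()
end)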

Also, you shouldn’t be doing the BindToClose save in Studio; you should just ignore it for the most part and only do it for actual servers. I actually didn’t know you were supposed to do a wait(30) for normal servers; I thought it was just Studio that skipped the 30 seconds to close the server.

If their data couldn’t be loaded, you should try to load it from a backup datastore, and don’t save their backup data to the actual datastore but to the backup datastore, because you don’t want to be overriding the player’s actual data.

Even if their backup data couldn’t be loaded, load the default data but don’t save it.
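
A rough sketch of that fallback flow, assuming a separate hypothetical backup store and a flag tracking where (if anywhere) this session’s data should be saved; none of this exists in the original script:

local BackupStore = DataStoreService:GetDataStore("MyDatabaseBackup") -- hypothetical store name
local saveTarget = {} -- [UserId] = "main" | "backup" | "none"

local function LoadWithFallback(plr)
	local key = "plr-"..plr.UserId

	local ok, data = pcall(function()
		return DataStore:GetAsync(key)
	end)
	if ok then
		-- nil here just means a new player; defaults get created on first save
		saveTarget[plr.UserId] = "main"
		return data
	end

	-- The main store errored: try the backup, and make sure any saves this
	-- session go to the backup store only
	local okBackup, backupData = pcall(function()
		return BackupStore:GetAsync(key)
	end)
	if okBackup and backupData then
		saveTarget[plr.UserId] = "backup"
		return backupData
	end

	-- Neither store could be read: hand out default data and never save it
	saveTarget[plr.UserId] = "none"
	return {["Current"] = nil, ["Save1"] = {}, ["Save2"] = {}, ["Save3"] = {}}
end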

That’s for an error, not for returning nil. If you want to add backups, you can have your own system in place.

If you read my post correctly, that’s what I meant. I don’t see how your post is relevant.

Thanks for all the feedback. This has definitely given me some pointers to improve my code.

@SilentsReplacement
I was not aware that GetAsync() does not have the same cooldown as SetAsync(), but after reading the documentation, you are correct. But wouldn’t it still be ideal to wait some time before successive API calls? Also, I’m only retrying when the GetAsync() fails (until succ or count == 5). But I have to adjust the code to only yield when the call is not successful.

The scope might seem unnecessary here because I’ve omitted large parts of the code. I have other processes that run which I’d like to segment clearly to show the distinction. The scope itself isn’t necessary, but I like partitioning my code with do…end blocks when I have multiple blocks of code created to initialize things.

Good point about retrying for UpdateAsync().

@LucasTutoriaisSaimo Great point! I actually didn’t think about kicking them, but that makes sense now and is definitely an approach I should be taking.

Also, throttling may not be an issue for me, because my maximum player count is 6. I could do separate coroutines, but I’d rather do simple tasks sequentially than in parallel wherever possible. I don’t think 6 successive SetAsync() calls for different keys will cause a huge issue (at least I hope not).

The wait(5) for Studio is so my data gets saved when I exit the game abruptly using the “Stop” button. Otherwise, it technically isn’t necessary. The extra wait(30) is to keep the execution window open for as long as possible before shutdown. I actually have no idea whether BindToClose keeps the server open as long as possible by default even after the connected function returns. If anyone has any further information on this, I’d like to hear.


Also, in regard to the whole conversation about data backups, I’m not sure I want to implement that kind of system from scratch. I suppose I could use DataStore2 or ProfileService, but I feel that the risk of data loss should be low with regular DataStores as long as I follow good practice. If anyone can point out any particular vulnerabilities in my code that can be prevented by using DataStore2 or ProfileService, I’d be happy to look into making the switch.

Sure, I get your point.

For your information, data loss happens when Roblox returns nil even though there is data; this is pretty common, actually. Errors are detectable, but datastores returning nil aren’t. That’s what DS2/OrderedBackups tries to mitigate.

Oh, I see. That’s actually surprising to hear.

I feel like this is a very large issue on Roblox’s end, though, and something that should be handled by them. Why would Roblox return nil without erroring if the player has data?

I can understand the inspiration behind DS2 and other similar modules, but this is something that shouldn’t be on the developer to account for.

Yeah, the problem, I believe, is that Roblox hosts their datastore info inside AWS (Amazon Web Services), and every time AWS is down, it causes problems for Roblox DataStores.

Instead of yielding for 30 seconds or more, yield based on how many saving jobs there are: whenever you call a function to save data on server shutdown, increment a save-jobs counter by 1, and decrement it by 1 once the request to save data has gone through.

Then simply yield until the save-jobs counter is back down to 0. This method is more efficient as well, but it comes with a caveat: only use it in a live game, not in Studio, to prevent freezing Studio, since there is a possibility that it yields longer than 30 seconds, at which point the server will be forcibly shut down.
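
A minimal sketch of that counter approach, reusing the Save and savingPlayers structures from the original script (the saveJobs name is just illustrative):

local saveJobs = 0 -- in-flight shutdown saves

game:BindToClose(function()
	if RunService:IsStudio() then return end -- skip in Studio, per the caveat above

	for _, player in ipairs(Players:GetPlayers()) do
		if not savingPlayers[player.UserId] then
			saveJobs = saveJobs + 1
			coroutine.wrap(function()
				Save(player)
				saveJobs = saveJobs - 1
			end)()
		end
	end

	-- Yield only as long as saves are actually still running
	while saveJobs > 0 do
		wait(0.5)
	end
end)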

PS:

Instead of yielding for 2 seconds at a time, a more efficient way would be to create a bindable, fire it once the save request has completed, and then wait for the bindable to fire.
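
For instance (saveFinished is a hypothetical event that Save would fire right after clearing savingPlayers):

local saveFinished = Instance.new("BindableEvent")

-- At the end of Save, after savingPlayers[plr.UserId] = nil:
-- saveFinished:Fire(plr.UserId)

Players.PlayerRemoving:Connect(function(plr)
	if not savingPlayers[plr.UserId] then
		Save(plr)
	else
		-- Yield until the in-progress save announces completion instead of polling every 2 seconds
		while savingPlayers[plr.UserId] do
			saveFinished.Event:Wait()
		end
	end

	-- ... OMITTED SECTION 3 cleanup and table clearing, as in the original script
end)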

Thanks for the feedback! Will definitely look into implementing these features.