Announcing DataStore v2.0 - Automatic Versioning, Data Tagging, & Listing!

Another question: What is the point of setting Metadata? Can anyone explain that to me? Why shouldn’t I just store the metadata as normal data, where I have more memory to work with? Is there an advantage of storing it in such a way?

3 Likes

Supporting migration code can be a pain point for long-term projects. I have hundreds of lines of legacy deserialization code that I still need to maintain; most of it is over 5 years old.

I felt clever implementing an extremely compact save format for inventory data that encodes using the inventory’s length, but now I need to pass the length of the legacy global inventory when deserializing characters so that I can properly skip the old equipment data.

I’m okay with keeping old code like this around for a while, years even, but it’s a huge relief to know that I don’t need to support it forever.
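One common way to keep legacy deserialization contained is a chain of per-version migrations, so each old format only needs a single upgrade step. This is a minimal sketch of that idea, not the poster's actual code; the `version`, `items`, `inventory` and `equipment` fields and the version numbers are hypothetical.

```lua
-- Hypothetical save layout: every save table carries a `version` field.
local CURRENT_VERSION = 3

-- migrations[n] upgrades a save from version n to version n + 1.
local migrations = {
	[1] = function(save)
		-- Assumed change: v1 stored items under `items`, v2 renamed it to `inventory`.
		save.inventory = save.items or {}
		save.items = nil
		return save
	end,
	[2] = function(save)
		-- Assumed change: v3 added an `equipment` table; default it to empty.
		save.equipment = save.equipment or {}
		return save
	end,
}

local function migrate(save)
	local version = save.version or 1
	while version < CURRENT_VERSION do
		save = migrations[version](save)
		version = version + 1
	end
	save.version = CURRENT_VERSION
	return save
end

local upgraded = migrate({ version = 1, items = { "sword" } })
print(upgraded.version, upgraded.inventory[1], #upgraded.equipment)
```

Once no stored save can still be at version 1, the first migration can simply be deleted.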

13 Likes

Because in the future we’ll be getting features to query by properties inside metadata. You’re fine doing it inside your normal data partition; metadata should be for small things.
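For reference, this is roughly what attaching metadata and user IDs to a key looks like with the v2 API. Treat it as a sketch that assumes a Roblox server script: the `Player_<userId>` key scheme and the metadata fields are made up, and `store` is expected to come from `DataStoreService:GetDataStore(...)`.

```lua
-- Writes `data` and attaches a small metadata table plus a user ID tag.
local function saveWithMetadata(store, userId, data)
	local setOptions = Instance.new("DataStoreSetOptions")
	-- Keep metadata small: a few descriptive fields, not the save itself.
	setOptions:SetMetadata({ schemaVersion = 3, region = "eu" }) -- hypothetical fields
	-- The third argument tags the key with the user IDs it relates to.
	return store:SetAsync("Player_" .. userId, data, { userId }, setOptions)
end

-- Reads the value back together with its metadata and tagged user IDs.
local function readMetadata(store, userId)
	-- GetAsync returns the value plus a DataStoreKeyInfo object.
	local value, keyInfo = store:GetAsync("Player_" .. userId)
	if keyInfo == nil then
		return value, nil, nil
	end
	return value, keyInfo:GetMetadata(), keyInfo:GetUserIds()
end

-- Usage inside Roblox (both calls yield, so wrap them in pcall in real code):
-- local store = game:GetService("DataStoreService"):GetDataStore("PlayerData")
-- saveWithMetadata(store, 12345, { coins = 5 })
```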

5 Likes

I don’t think that’s the best option, but it depends on what kind of script it is. As Roblox continues to update things, older scripts end up breaking for one reason or another. I had a game from 2015 where everything used to work, and none of it does anymore. Just update the script occasionally to keep it working.

Off note: I am a huge fan of your game, keep up the great development! Nice meeting you, Tomarty!

4 Likes

What benefits does this have vs. DataStore2 made by @Kampfkarren? I made the switch to DataStore2 last Saturday in my game, and do not plan to switch back unless there is a good reason for me to use this. I can also access versions just fine with DataStore2, and the data stays there forever, not just temporarily.

2 Likes

Is there an ETA on binary string support for DataStores? string.pack was added, but we still can’t use packed strings for saves without expensive base64 / ascii85 conversions. JSON is great for beginners, but it’s unnecessarily slow, and is honestly horrible to have set as a default; DataStores only ever needed to support strings, because developers can easily use JSONEncode / JSONDecode. I have the same criticism for MessagingService.

I’m hoping to switch to using binary strings for save data someday (instead of bit buffers), so that I can efficiently skip through data and find specific information without needing to deserialize the entire thing. If I need to redirect them to another server, I could just skip to their active character’s data, read its current zone, and send them on their way. Players can have potentially hundreds of individual characters with stats, items, quest progress, perks, etc., so deserializing between intermediate formats can take milliseconds of execution time from an already busy server. With a packed string format it’s very fast.
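The "skip through the data" idea can be sketched with length-prefixed records: each character record starts with its own byte length, so a reader can hop over whole records without decoding them. The layout below (name, zone, then an opaque inventory blob) is purely hypothetical, and as the post notes, DataStores do not accept arbitrary binary strings at the time of writing, so this only shows the in-memory technique.

```lua
-- Hypothetical record layout:
--   2-byte length | 1-byte-prefixed name | 1-byte-prefixed zone | inventory bytes
local function packCharacter(name, zone, inventory)
	local payload = string.pack("<s1s1", name, zone) .. inventory
	-- "s2" writes the payload preceded by its length as a 2-byte integer.
	return string.pack("<s2", payload)
end

-- Returns the zone of the character at `activeIndex` without decoding
-- any of the records that come before it.
local function readActiveZone(save, activeIndex)
	local pos = 1
	for _ = 1, activeIndex - 1 do
		local recordLength = string.unpack("<I2", save, pos)
		pos = pos + 2 + recordLength -- hop over the length prefix and the payload
	end
	local _name, zone = string.unpack("<s1s1", save, pos + 2)
	return zone
end

local save = packCharacter("Ayla", "Harbor", string.rep("\0", 300))
	.. packCharacter("Bren", "Mines", string.rep("\0", 50))
	.. packCharacter("Cole", "Tundra", "")

print(readActiveZone(save, 3))
```

The cost of finding a character grows with the number of records skipped, not with the size of their contents.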

I’m also planning a house system with furniture customization, and string.pack is the obvious solution for storing these objects. “{X:-61.20931480293841,Y:-13.1213440189129,Z:21.10298308140293}” has horrible efficiency compared to string.pack("fff", x, y, z), which uses only 12 bytes. Compact formats like this will reduce network overhead, server overhead, and data storage costs. It should be supported properly.
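A quick way to check the size difference yourself, using the same coordinates as above. The `<` prefix (little-endian) is an addition for portability and is not in the original call.

```lua
local x, y, z = -61.20931480293841, -13.1213440189129, 21.10298308140293

-- Three 32-bit floats, 4 bytes each.
local packed = string.pack("<fff", x, y, z)
local ux, uy, uz = string.unpack("<fff", packed)

-- The text form quoted in the post, for comparison.
local text = "{X:-61.20931480293841,Y:-13.1213440189129,Z:21.10298308140293}"

print(#packed, #text)
-- Values come back at single precision, so they are close but not bit-for-bit equal.
print(ux, uy, uz)
```

The trade-off is precision: "f" keeps roughly seven significant digits, which is usually plenty for furniture placement but worth knowing about.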

26 Likes

Honestly one of the best updates we’ve had in a while, all of these features are gonna make developing much easier.

3 Likes

What is the advantage of having metadata tags over simply saving a table with said info in?

4 Likes

give us the power to make actual SQL queries

9 Likes

Is there any reason to use DataStore2 now?

1 Like

The answer is simple: it will give you the ability to query and index based on attributes in the future.

1 Like

Great improvements, I’m working on a new game and this will be very helpful!

3 Likes

Will this be automated by Roblox, or will we still be required to use our own code to do it?

1 Like

Roblox expects you to use the new 2.0 to save and to delete user data upon request. Not automated.

1 Like

I thought it would be automated since there is literally a way to attach a UserId to a key.

1 Like

Right now, listing keys only allows you to set the prefix of the keys. We’re considering supporting sorted listing so that you can select a range of keys in ascending or descending order. This can help you build in-game search based on the key structure.

4 Likes

ListKeysAsync Feedback/Notes

So I’ve been able to play around with ListKeysAsync at the very least and I took a few notes, some things of which I find confusing. I understand that this is a beta so there’s probably more features on the way and a reason why things were designed the way they were, but I can’t help but point 'em out.

  1. What’s the design decision behind using instances for DataStoreOptions? This same question I had for TeleportOptions but I don’t know if the latter was explained and I’ve never really used TeleportOptions before because I didn’t feel the need to switch over to that method.

  2. What’s the design decision behind making a DataStoreKey instance whose only purpose is to hold the key’s name? Why not add the key directly to each page of a DataStoreKeyPages? Are there plans to add more properties/methods? If not, this just seems like API bloat. In my honest opinion, a lot of the newly added APIs seem like bloat unless there are plans to use them to introduce more properties/methods, but it still raises the question of why they were made instances and not something else (datatypes, more properties/methods, dictionary indices, etc).

  3. I was hoping that ListKeysAsync would also return each key’s value, but that wouldn’t fit a method whose name implies it only lists keys. Could we, at any point, expect to be able to traverse an entire DataStore’s keys and values rather than needing to list the keys and then follow up with a getter or setter to perform operations? If it conflicts with versioning then we should be able to specify which version we want to list from.

    A. I should mention that purely listing keys is good enough for a few of my use cases, namely caching a list of taken names for a guild feature in an RPG to prevent two guilds from using the same name. It wouldn’t cause a conflict because I can easily id each guild, but for user experience’s sake no two guilds should share the same name. I thus don’t need to know the value of those keys. Therefore, my question here is more of a “potential use cases” than a “I need this”.
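The guild-name use case in 3A could look something like the sketch below, which only needs key names and never reads values. It assumes guilds are stored under keys of the form `Guild_<name>` (a made-up scheme) and that `store` is a DataStore obtained through the v2 API.

```lua
-- Collects every key name under `prefix` into a set-like table.
local function collectKeyNames(store, prefix, pageSize)
	local names = {}
	local pages = store:ListKeysAsync(prefix, pageSize)
	while true do
		-- Each item is a DataStoreKey whose KeyName holds the key's name.
		for _, key in ipairs(pages:GetCurrentPage()) do
			names[key.KeyName] = true
		end
		if pages.IsFinished then
			break
		end
		pages:AdvanceToNextPageAsync()
	end
	return names
end

-- Checks the cached set; "Guild_" is the hypothetical key prefix.
local function isGuildNameTaken(takenNames, guildName)
	return takenNames["Guild_" .. guildName] == true
end

-- Usage inside Roblox (yields; wrap in pcall in real code):
-- local taken = collectKeyNames(guildStore, "Guild_", 50)
-- print(isGuildNameTaken(taken, "Wolves"))
```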

As for specifying UserIds for GDPR compliance, not sure how one would use that - are we expected to traverse our entire DataStores every time there’s GDPR deletion requests, then use GetAsync, find out if the UserId is returned in GetUserIds and then remove it accordingly? That is mad impractical considering how we don’t have many requests to work with. Could an example be provided of how to use new DataStore APIs for GDPR compliance?
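To make the concern concrete, here is the brute-force flow described above written out for a single DataStore. It is not an official or recommended approach, just a sketch of the list, GetAsync, GetUserIds, remove sequence; `store` is assumed to be a v2 DataStore and the function names are made up.

```lua
local function containsUserId(userIds, userId)
	for _, id in ipairs(userIds) do
		if id == userId then
			return true
		end
	end
	return false
end

-- Removes every key in `store` that is tagged with `userId`.
-- Costs one GetAsync per key, which is why this does not scale
-- with the request budget.
local function eraseUserFromStore(store, userId, pageSize)
	local removed = {}
	local pages = store:ListKeysAsync("", pageSize)
	while true do
		for _, key in ipairs(pages:GetCurrentPage()) do
			local _, keyInfo = store:GetAsync(key.KeyName)
			if keyInfo and containsUserId(keyInfo:GetUserIds(), userId) then
				store:RemoveAsync(key.KeyName)
				table.insert(removed, key.KeyName)
			end
		end
		if pages.IsFinished then
			break
		end
		pages:AdvanceToNextPageAsync()
	end
	return removed
end
```

Repeating this for every DataStore returned by ListDataStoresAsync is what makes the whole thing impractical.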


I originally dedicated this section for feedback/notes on ListDataStoresAsync but it turns out there was an issue with the test I ran. I’ve forwarded this feedback however via a follow up just to check if the behaviour was intended or not.

See my follow up for an explanation as to how I resolved the issues I was facing and why I’m archiving this section of my feedback. Just for reference purposes.

Archive

ListDataStoresAsync Feedback/Notes

EDIT (3:35 PM EDT): Additionally, ListDataStoresAsync seems to default the prefix to “0” whenever it’s not specified? I tried getting every single DataStore that exists in my experience but only DataStores being prefixed with “0” were returned. I know I have far more DataStores than that.

Repro code
--- ListDataStores test
-- @author colbert2677

local LIST_PREFIX = nil
local PAGE_SIZE = 100

-- @see https://developer.roblox.com/en-us/api-reference/class/Pages
local function iterPageItems(pages)
	return coroutine.wrap(function()
		local pagenum = 1
		while true do
			for _, item in ipairs(pages:GetCurrentPage()) do
				coroutine.yield(item, pagenum)
			end
			if pages.IsFinished then
				break
			end
			pages:AdvanceToNextPageAsync()
			pagenum = pagenum + 1
		end
	end)
end

local storePages = game:GetService("DataStoreService"):ListDataStoresAsync(LIST_PREFIX, PAGE_SIZE)

for item, pageNo in iterPageItems(storePages) do
	print(item.DataStoreName)
end

This only prints DataStores of mine that start with 0 as a character, since I do have some DataStores that use 0 at the start of their names. When I change the prefix to 1 however it shows more DataStores (but in this case, again, only stores starting with “1”), meaning that ListDataStoresAsync isn’t doing what it should or rather it’s behaving unexpectedly out of touch with documentation.

EDIT (3:38 PM EDT): I’ve had to manually traverse my DataStores with a workaround: more for loops for every alphanumeric character. Roblox forbid I have any unpredictable cases where a DataStore does not start with an A-Z character, either upper or lowercase. Even that was a pain because it threw a 502, so I had to run this code in chunks.

Repro code
--- ListDataStores test
-- @author colbert2677

local DataStoreService = game:GetService("DataStoreService")

local PAGE_SIZE = 100

-- @see https://developer.roblox.com/en-us/api-reference/class/Pages
local function iterPageItems(pages)
	return coroutine.wrap(function()
		local pagenum = 1
		while true do
			for _, item in ipairs(pages:GetCurrentPage()) do
				coroutine.yield(item, pagenum)
			end
			if pages.IsFinished then
				break
			end
			pages:AdvanceToNextPageAsync()
			pagenum = pagenum + 1
		end
	end)
end

local dataStoreNames = {}
local startClock = os.clock()

table.insert(dataStoreNames, "--- NUMERIC ---")

-- Start by iterating numeric DataStore prefixes
for i = 0, 9 do
	local storePages = DataStoreService:ListDataStoresAsync(tostring(i), PAGE_SIZE)
	for item, _ in iterPageItems(storePages) do
		table.insert(dataStoreNames, item.DataStoreName)
	end
end

table.insert(dataStoreNames, "--- CAPITAL LETTER FIRST ---")

-- Next, iterate all uppercase DataStore prefixes
for i = 65, 90 do
	local storePages = DataStoreService:ListDataStoresAsync(string.char(i), PAGE_SIZE)
	for item, _ in iterPageItems(storePages) do
		table.insert(dataStoreNames, item.DataStoreName)
	end
end

table.insert(dataStoreNames, "--- LOWERCASE LETTER FIRST ---")

for i = 97, 122 do
	local storePages = DataStoreService:ListDataStoresAsync(string.char(i), PAGE_SIZE)
	for item, _ in iterPageItems(storePages) do
		table.insert(dataStoreNames, item.DataStoreName)
	end
end

table.insert(dataStoreNames, "--- END ---")

print(dataStoreNames)
print(string.format("Took %f to list all alphanumeric DataStore prefixes", os.clock() - startClock))

At this point, prefixing is looking more like a requirement than an option and that’s egregious for any experience that has already stored millions of keys worth of data. Not looking too hot the more I dive into it. Might be a lesson to myself to test before hypeposting ever again. It’s a beta, I know, but somehow I find myself unable to trust that things will work the way I expect. Sigh.

EDIT (3:51 PM EDT): Speed isn’t too bad since I’m using this in Studio. Took 0.780010 seconds to get through 26 DataStores that presently exist in my experience which are prefixed by numbers.

EDIT (3:57 PM EDT): Ditto for my DataStores prefixed with capital letters. 2.131966 seconds for 107 DataStores. Speed isn’t bad, it’s the fact that I need to use such absurd workarounds to make ListDataStoresAsync work properly that’s somewhat offputting.

EDIT (4:00 PM EDT): At some point ListDataStoresAsync starts failing and I have no idea what kind of budget I’m working with or what the explicit rejection was. Naturally, I’m calling ListDataStoresAsync so many times it makes sense for me to trip an error (62 times total for alphanumeric prefixes; 10 for numbers, 26 for uppercase letters and 26 for lowercase letters), but by goodness if you really want to get every single DataStore in your experience you’re going nowhere with ListDataStoresAsync.

I will not advise developers to start using prefixes for new work going forward, but I will also not encourage forgoing them. Think about your use case carefully and whether you actually need it. Most of your cases will be for development, not production-level work. Do not take advantage of the fact that these functions exist because you might run into similar issues. Keep consolidating as much as possible.

EDIT (4:07 PM EDT): Just had a thought: you may actually have a case where you need to do what I’m doing. The only production-level use case I can think of off the top of my head is GDPR compliance. You would need to search every single DataStore, then every single key of that DataStore, then GetAsync those keys and check if anything can remotely resemble the user requesting deletion, then delete it. See how many hoops there are to jump through? Not to mention that even the major budget allowance Studio has to make changes to DataStores wouldn’t be enough to commit to such an operation like that. Not entirely practical.

EDIT (4:12 PM EDT): “Final” code sample, just in case it hasn’t been consistent throughout my posts. This is what I’m working with up until now. May have to space out every collection operation by 30-60 seconds or run them independently across a generous period of time.

Open here
--- ListDataStores test
-- @author colbert2677

local DataStoreService = game:GetService("DataStoreService")

local PAGE_SIZE = 100

-- @see https://developer.roblox.com/en-us/api-reference/class/Pages
local function iterPageItems(pages)
	return coroutine.wrap(function()
		local pagenum = 1
		while true do
			for _, item in ipairs(pages:GetCurrentPage()) do
				coroutine.yield(item, pagenum)
			end
			if pages.IsFinished then
				break
			end
			pages:AdvanceToNextPageAsync()
			pagenum = pagenum + 1
		end
	end)
end

local dataStoreNames = {}
local startClock = os.clock()

table.insert(dataStoreNames, "--- NUMERIC ---")

for i = 0, 9 do
	local storePages = DataStoreService:ListDataStoresAsync(tostring(i), PAGE_SIZE)
	for item, _ in iterPageItems(storePages) do
		table.insert(dataStoreNames, item.DataStoreName)
	end
end

table.insert(dataStoreNames, "--- CAPITAL LETTER FIRST ---")

for i = 65, 90 do
	local storePages = DataStoreService:ListDataStoresAsync(string.char(i), PAGE_SIZE)
	for item, _ in iterPageItems(storePages) do
		table.insert(dataStoreNames, item.DataStoreName)
	end
end

table.insert(dataStoreNames, "--- LOWERCASE LETTER FIRST ---")

for i = 97, 122 do
	local storePages = DataStoreService:ListDataStoresAsync(string.char(i), PAGE_SIZE)
	for item, _ in iterPageItems(storePages) do
		table.insert(dataStoreNames, item.DataStoreName)
	end
end

table.insert(dataStoreNames, "--- END ---")

print(#dataStoreNames, dataStoreNames)
print(string.format("Took %f to list all DataStores", os.clock() - startClock))

EDIT (4:15 PM EDT): Lowercase letters finally started being responsive. 2.430447 seconds for a whopping 0 DataStores. What…?

EDIT (4:23 PM EDT): Decided to throw in a task.wait(60) between each collection. Including the wait time, 124.870384 seconds to retrieve 135 DataStores prefixed by alphanumeric characters. This is as close to a production-level code sample as I can get that reasonably waits until the engine issues more budget for the current minute. Feel free to reference.

Open here
--- ListDataStores test
-- @author colbert2677

local DataStoreService = game:GetService("DataStoreService")

local PAGE_SIZE = 100

-- @see https://developer.roblox.com/en-us/api-reference/class/Pages
local function iterPageItems(pages)
	return coroutine.wrap(function()
		local pagenum = 1
		while true do
			for _, item in ipairs(pages:GetCurrentPage()) do
				coroutine.yield(item, pagenum)
			end
			if pages.IsFinished then
				break
			end
			pages:AdvanceToNextPageAsync()
			pagenum = pagenum + 1
		end
	end)
end

local dataStoreNames = {}
local startClock = os.clock()

table.insert(dataStoreNames, "--- NUMERIC ---")

for i = 0, 9 do
	local storePages = DataStoreService:ListDataStoresAsync(tostring(i), PAGE_SIZE)
	for item, _ in iterPageItems(storePages) do
		table.insert(dataStoreNames, item.DataStoreName)
	end
end

task.wait(60)

table.insert(dataStoreNames, "--- CAPITAL LETTER FIRST ---")

for i = 65, 90 do
	local storePages = DataStoreService:ListDataStoresAsync(string.char(i), PAGE_SIZE)
	for item, _ in iterPageItems(storePages) do
		table.insert(dataStoreNames, item.DataStoreName)
	end
end

task.wait(60)

table.insert(dataStoreNames, "--- LOWERCASE LETTER FIRST ---")

for i = 97, 122 do
	local storePages = DataStoreService:ListDataStoresAsync(string.char(i), PAGE_SIZE)
	for item, _ in iterPageItems(storePages) do
		table.insert(dataStoreNames, item.DataStoreName)
	end
end

table.insert(dataStoreNames, "--- END ---")

print(#dataStoreNames, dataStoreNames)
print(string.format("Took %f to list all DataStores", os.clock() - startClock))

EDIT (6:50 PM EST): I think it might be wiser to write my own iterator function for this. A huge dependency of my code right now is the page iterator function on the Developer Hub. Followed up on this and got another code sample that successfully iterates them all without needing to manually prefix.
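For anyone who also wants to drop the Developer Hub iterator, a coroutine-free version can be written as a plain closure. This is a sketch of one way to do it and is not the code sample from the follow-up; it only relies on the `GetCurrentPage`, `IsFinished` and `AdvanceToNextPageAsync` members that the code above already uses.

```lua
-- Returns an iterator over every item in a Pages-like object,
-- yielding (item, pageNumber) just like iterPageItems above.
local function iterPages(pages)
	local page = pages:GetCurrentPage()
	local index = 0
	local pageNumber = 1
	return function()
		index = index + 1
		-- Move forward until a page with an item at `index` turns up,
		-- skipping empty pages along the way.
		while page[index] == nil do
			if pages.IsFinished then
				return nil
			end
			pages:AdvanceToNextPageAsync()
			page = pages:GetCurrentPage()
			pageNumber = pageNumber + 1
			index = 1
		end
		return page[index], pageNumber
	end
end

-- Usage inside Roblox (yields; wrap in pcall in real code):
-- for item in iterPages(DataStoreService:ListDataStoresAsync()) do
-- 	print(item.DataStoreName)
-- end
```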


I will not be able to gather any testing data on ListKeysAsync. I have an extremely good data set to work with but it’s my live data and the Developer Hub encourages not using production data to test the new features so I’ll follow that recommendation strictly. I’m not taking chances with Roblox.

14 Likes

Is the userIds field guaranteed to not be used for automatic deletion in the future?
e.g. If you have a system where people can list users they want to allow into their house, and one of the allowed users requested erasure, Roblox would never wipe the whole record?

1 Like

I would prefer that either it will be automated or that we will have the option to have the data erased automatically. I don’t see a reason for it not to be enforced, since to me that seems to be the whole purpose of the feature - to automate the erasure process.

Giving developers the option to not erase data that was requested by the user to be deleted doesn’t make sense to me.

1 Like

If you have a table storing profile information on Bob, and Bob adds Alice onto a permissions list associated with his profile, the record would/may be tagged with both Bob and Alice in the array. If Alice made a request for erasure, and it automatically wiped all records tagged for her - this would end up wiping Bob’s profile - which is something obviously I would want to avoid.

Really I would want some routine which is called when an erasure request is made so I can just modify the permission list to remove Alice from Bob’s permission list.
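Until something like that exists, the manual version of that routine might look like the sketch below: an UpdateAsync transform that strips one user from the record's permission list and from its user ID tags, leaving the rest of the profile alone. The `allowedUserIds` field and the `Profile_<userId>` key scheme are hypothetical. As I understand it, the transform has to pass the user IDs and metadata back for them to be kept.

```lua
-- Returns a copy of `list` without `userId`.
local function removeUser(list, userId)
	local kept = {}
	for _, id in ipairs(list) do
		if id ~= userId then
			table.insert(kept, id)
		end
	end
	return kept
end

-- Builds a transform function suitable for UpdateAsync.
local function makeErasureTransform(userId)
	return function(profile, keyInfo)
		if profile == nil then
			return nil -- nothing stored; returning nil cancels the update
		end
		profile.allowedUserIds = removeUser(profile.allowedUserIds or {}, userId)
		local taggedIds = removeUser(keyInfo:GetUserIds(), userId)
		-- Return the new value plus the user IDs and metadata to keep.
		return profile, taggedIds, keyInfo:GetMetadata()
	end
end

-- Usage inside Roblox (yields; wrap in pcall in real code):
-- store:UpdateAsync("Profile_" .. bobId, makeErasureTransform(aliceId))
```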

I suppose we don’t know any implementation details yet, but maybe it will be covered by the “Cloud Scripts” on the roadmap, as it sounds like the sort of thing a Cloud Script should handle?

3 Likes