Datastore complains string is too large, but it actually isn't

I’ve run into a problem where Lua tells me a string is under 260,000 characters (the limit on a single Datastore entry), yet Datastores tell me it’s over the limit. I’ve isolated the issue to this code (run it yourself and see, it will always produce this output):

local HttpService = game:GetService("HttpService")
local DataStore = game:GetService("DataStoreService")

local link = "https://pastebin.com/raw/wyAFPiEb" -- A link to some output
local stuff = HttpService:GetAsync(link)
-- Checking for multiple byte UTF-8 characters. This is known to inflate
-- the size of the Datastores. The below code shows there are only single byte
-- characters in the output. 
for extendedCharacter in string.gmatch(stuff,"[\128-\191]") do
	print("Character has more than 1 byte!") -- This will never print
end

print("Done")
-- Print the total number of bytes in the string.
print("Total string length was: "..#stuff) -- Total string length was: 249574

local datastore = DataStore:GetGlobalDataStore("Test")
datastore:SetAsync("Test",stuff) -- 105: Serialized value exceeds 260,000 character limit.

As you can see above, I’m checking whether the string contains any multi-byte UTF-8 characters. It’s extremely unlikely this is the issue, since I’m really only storing English strings that are printable on a standard keyboard. I figured I’d at least show I considered it. :man_shrugging:

The above output was generated using JSONEncode.

Does anybody have any ideas on what may be causing this inflation?

You’re storing a table.

I’m not storing a table. I’m retrieving a string that was generated by JSONEncode on a table.

Character count in string cannot exceed 65,536 characters.
Source: Data Stores

I may be misreading this but on the wiki it specifically states:

Character count in string cannot exceed 65,536 characters.

Yet underneath, it also states that the data cannot exceed the length you provided, which is contradictory and confusing.

With 249,574 characters, you are likely getting pretty close. I think the limit is about ~260,000 normal characters. Special characters might take up more space. You could try decreasing the size by removing special characters and see what happens.

It’s likely that the wiki means normal characters and not also special ones. Why do you need so many characters?

Also stated, it says strings cannot be longer than 65,536 characters. It may be that tables can be stored longer, and strings cannot. Likely, Lua is unsure what to do with that much data.

I think the limit is about ~260,000 normal characters. Special characters might take up more space.

Could you please explain what you mean by “special characters”? If you mean multi-byte UTF-8 characters (anything containing a byte value above 127), then the check above should verify that the string contains only single-byte characters.
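For reference, the single-byte claim can be checked in plain Lua with a slightly broader pattern than the one in the original post. This is only a sketch: `isAscii` is a hypothetical helper name, and it flags every byte above 127 rather than just UTF-8 continuation bytes.

```lua
-- Hypothetical helper: returns true when every byte in `s` is 127 or below,
-- i.e. the string contains no multi-byte UTF-8 sequences.
local function isAscii(s)
	return string.find(s, "[\128-\255]") == nil
end

print(isAscii("plain English text")) -- true
print(isAscii("caf\195\169"))        -- false ("é" is the two bytes 195, 169)
```

If this returns true for the retrieved string, multi-byte characters cannot be what is inflating the stored size.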

Also stated, it says strings cannot be longer than 65,536 characters. It may be that tables can be stored longer, and strings cannot. Likely, Lua is unsure what to do with that much data.

I read that and have no idea what it’s getting at, since as @CodeNinja16 pointed out it immediately contradicts itself. If you really want confirmation that line of reasoning is incorrect, try running this:

local Datastores = game:GetService("DataStoreService")
local datastore = Datastores:GetGlobalDataStore("Test")

-- Create a string of single-byte character "t" 260000 characters long.
local str = string.rep("t",260000)
datastore:SetAsync("Test",str) -- This will always be successful

Maybe you want to look away from the error itself and how you’re saving data instead. What data are you explicitly saving that could possibly reach the maximum character count for the DataStore?

I’m creating an engine that plays back server rounds, which requires quite a hefty file size. I’m currently splitting the components into multiple Datastore entries whenever they exceed the maximum size, and using a header entry to record how many components there are to retrieve. The information is mostly keyframe data, so a stripped-down version of the Datastore entries would look something like this:

-Header = {Keyframes = 2}
-Keyframe1 = {...}
-Keyframe2 = {...}

For something like playback, there’s really no way of shrinking the size down so it could fit within 260,000 characters, even with good compression.
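The header-plus-components layout described above could be sketched roughly as follows. This is not the poster's actual code: `splitIntoChunks` and the chunk size of 4 are hypothetical, chosen only to keep the example small, and the sketch is plain Lua with no Datastore calls.

```lua
-- Hypothetical helper: cuts `s` into pieces of at most `maxLen` bytes each.
local function splitIntoChunks(s, maxLen)
	local chunks = {}
	for i = 1, #s, maxLen do
		chunks[#chunks + 1] = string.sub(s, i, i + maxLen - 1)
	end
	return chunks
end

local data = string.rep("k", 10)          -- stand-in for the encoded playback data
local chunks = splitIntoChunks(data, 4)   -- "kkkk", "kkkk", "kk"
local header = { Keyframes = #chunks }    -- tells the reader how many entries to fetch

print(header.Keyframes)               -- 3
print(table.concat(chunks) == data)   -- true
```

In a real game each chunk would presumably be written under its own key (for example "Keyframe1", "Keyframe2") and concatenated again on load before decoding.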

Storing the decoded format seems to work without any errors.

local HttpService = game:GetService("HttpService")
local DataStore = game:GetService("DataStoreService")

local Link = "https://pastebin.com/raw/wyAFPiEb"
local Stuff = HttpService:JSONDecode(HttpService:GetAsync(Link))

-- Use a separate name so the service variable above is not shadowed.
local TestStore = DataStore:GetGlobalDataStore("Test")
TestStore:SetAsync("Test", Stuff)

Odd. I forget where I read it, but I had always been under the impression SetAsync automatically called JSONEncode on any table it received.

This somehow seems to work. Thank you everyone for your help!

Emphasis on the “somehow”. I guess the only point of all that was to take some experience away from it.

The data size was decreased dramatically by encoding as a table, because you cut out all of the padding and syntax characters that JSON uses, which you said yourself, use more space due to being special characters.

The limit on size for tables is most likely larger for that reason.

Moral of the story: Encoding data in a table for data stores is generally a good idea, especially if you have multiple data sets, or particularly large data. Always do this unless you have a reason not to.

I was going off what the Roblox wiki says. It says that strings have a 65,536 character limit, and I thought the wiki would be accurate. It also says that datastores have a 260,000 character limit.

I’m sorry but I think you’re wrong.

The data size was decreased dramatically by encoding as a table, because you cut out all of the padding and syntax characters that JSON uses

But I was manually checking the string length of the actual encoded string. It was below the limit.

which you said yourself, use more space due to being special characters.

JSON uses standard characters like colon, comma, quotation marks, etc. These are not multi-byte characters and therefore do not take up extra space.

The limit on size for tables is most likely larger for that reason.

I’ve never seen any recorded limit on tables. I think what you’re referring to is this seeming contradiction:


Besides, I’d gotten far beyond 65,536 characters before with my JSON encoded strings.

As a side note, I just noticed this:

So, taking this into account, along with the fact that I see a recorded limit only for strings and none for tables, I’m going to have to say it does internally convert to strings, but somehow does a better job of it than JSONEncode does?
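One way to probe that guess is to estimate how long the string would become if it were written out again as a JSON string literal. This rests on an assumption that nothing in this thread confirms: that the store serializes a string value as JSON, so that every quotation mark and backslash inside it gains an escaping backslash. A string produced by JSONEncode is full of quotation marks, so it would grow; a string of plain "t" characters would not. The helper name `estimateEscapedLength` is hypothetical, and the sketch ignores control characters and the surrounding quotes.

```lua
-- Hypothetical helper: length of `s` after escaping `"` and `\` as JSON would.
-- ASSUMPTION: the Datastore serializes strings this way; this is unverified.
local function estimateEscapedLength(s)
	local _, quotes = string.gsub(s, '"', "")
	local _, backslashes = string.gsub(s, "\\", "")
	return #s + quotes + backslashes
end

local sample = '{"Keyframes":2}'
print(#sample)                        -- 15
print(estimateEscapedLength(sample))  -- 17
print(estimateEscapedLength(string.rep("t", 100))) -- 100, nothing to escape
```

Running the same count on the 249,574-character string would show whether escaping alone could push it past 260,000.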