Impossible error: on new servers, string.len errors on dictionary values, claiming a string value is a table

I joined my game on the 19th (today) and saw that my menu wasn’t loading and there was an error in the console. This error is completely impossible, and I’ve done everything I can to verify that it is impossible. I’ve even added debug code after it first started happening to try and see if somehow I’m making a mistake, and I don’t think I am.

In my game this is happening with a module required to load the game’s menu. This module defines a dictionary of words to translated words like this: dict["FESTIVAL"] = "thikliixtot", and provides some functions to access metadata about the dictionary. This entire table is hardcoded into the module, is not exposed, and can never change.

This is the entire ModuleScript: SkiedonicDictionary.lua (60.4 KB). It’s old and weird code, but it has worked flawlessly for almost 3 years. It just broke on the 19th around 9:00 PST, which coincides with some Luau-related flags being enabled (e.g. LuauStringFastcall).

The problem is that string.len is erroring because the value passed to it is allegedly a table rather than a string. This is not true: as you can see, I have debug prints set up to report if a value in this table is a table. I would expect to see “WHAT” printed in the output if this table actually contained a table instead of a string, but I do not see that.

Error

Dictionary module

for i,v in pairs(dict) do
	print("Dict", i,v)
	if typeof(v) == "table" then 
		for j,k in pairs(v) do
			print("WHAT ------",j,k) -- What the actual heck
		end
	end
	
	avg = avg + string.len(v)
	dictSize = dictSize + 1
	table.insert(dictArr, v)
end

[Image: where the module is required]

This was happening in about 3/50 servers earlier in the day. It seems more common now that hours have passed (new servers have cycled in?). You can tell that the game is broken if you do not load into the menu showing a “Play” button.

To reproduce, you can just join a VIP server. I think that reproduces it every time, or at least much more often.

Other important details:

  • This does not reproduce in Studio.
  • If I require this module on the server from the console, it works perfectly fine. This is only broken on the client.
  • This happens to new servers, and seems broken in that server for all players who join. Some players manage to be in-game correctly in these broken servers. Perhaps something broke mid-flight and now only breaks on new joins?

I suspect this might end up magically fixed tomorrow, but I figured I should report it anyway.


Thanks for the report! The offending change has been disabled, will investigate more tomorrow.


I figured I wouldn’t be able to sleep if I didn’t know what this was about so I investigated this - thanks for an awesome repro, this was really simple to figure out with that.

It’s a rare issue that happens when a function uses a lot of string literals and also calls a builtin function through the fastcall optimization (string.len previously didn’t use it, but now it does). It additionally only manifests when the fastcall optimization is dynamically disabled. That can happen when running on a client that has an older version of the VM (which is what happened here, as we deployed RCC before the client), and it can also happen when using getfenv or setfenv, which disable these optimizations.

Incidentally, @howmanysmaII I think the issue you recently mentioned on Twitter with math.cos is the same problem, more or less.

This should be fixed next week as the fix is very simple.


I did, but I got it figured out. It’s definitely a getfenv issue. I couldn’t reproduce it, but I do know that I updated my code to use getfenv after concluding that it didn’t deoptimize code.

Either way, I’d suggest using # instead of string.len, that might fix it.
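For reference, a minimal sketch of that swap, assuming every value in the table really is a string. The two-entry table is a hypothetical stand-in for the real dictionary (only the FESTIVAL entry comes from the original post), and whether # actually sidesteps the bug is an assumption based on it being an operator rather than a call into the string library:

```lua
-- Hypothetical stand-in for the real dictionary; HELLO is made up.
local dict = {
	FESTIVAL = "thikliixtot",
	HELLO = "abc",
}

local total = 0
local dictSize = 0
local dictArr = {}

for _, v in pairs(dict) do
	-- The length operator instead of string.len(v).
	total = total + #v
	dictSize = dictSize + 1
	table.insert(dictArr, v)
end

-- Average word length across the dictionary.
local avg = total / dictSize
print(avg)
```

If a value ever were a table, # would silently return its array length instead of erroring, so this only makes sense when the values are known to be strings.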

getfenv definitely does deoptimize code, which I suspect is why you ran into this problem to begin with.


Weird, I wasn’t getting that at all. The only one that really took a nose dive was the one with setfenv.


Keep in mind that you actually have to be running expensive Lua code (usually code with loops that run a ton of times each frame) to see a difference. Even if it’s 10x slower… if the script was barely taking any time to run in the first place you won’t notice a difference with the “deoptimized” version.

local ReplicatedStorage = game:GetService("ReplicatedStorage")
local Debug = require(ReplicatedStorage.Debug)

local env = getfenv()
setfenv(1, setmetatable({}, {
	__index = function(_, k) return env[k] end,
	__newindex = function(_, k, v) env[k] = v end,
}))

Debug.Warn("!Testing code: %s", script.Name, "setfenv")

local RandomLib = Random.new(tick() % 1 * 1E7)
local function RandomVector()
	local s = 2*(RandomLib:NextNumber()-0.5)
	local t = 6.2831853071796*RandomLib:NextNumber()
	local rx = s
	local m = math.sqrt(1-s*s)
	local ry = m*math.cos(t)
	local rz = m*math.sin(t)
	return Vector3.new(rx,ry,rz)
end

local START_TIME = tick()
for _ = 1, 1E5 do
	local _ = RandomVector() + RandomVector() * RandomVector()
end
local END_TIME = tick() - START_TIME

print(string.format("setfenv script took %d ms", END_TIME * 1000))

This was the code in question.

local env = getfenv()

local RandomLib = Random.new(tick() % 1 * 1E7)
local function RandomVector()
	local s = 2*(RandomLib:NextNumber()-0.5)
	local t = 6.2831853071796*RandomLib:NextNumber()
	local rx = s
	local m = math.sqrt(1-s*s)
	local ry = m*math.cos(t)
	local rz = m*math.sin(t)
	return Vector3.new(rx,ry,rz)
end

while wait(1) do
	local START_TIME = tick()
	for _ = 1, 1E5 do
		local _ = RandomVector() + RandomVector() * RandomVector()
	end
	local END_TIME = tick() - START_TIME

	print(string.format("script took %d ms", END_TIME * 1000))
end

This runs in ~140 ms for me in Studio, and in ~90 ms if I comment out the getfenv call.

Huh, interesting. Glad I reverted to before using getfenv. Is it just using getfenv at all or is getfenv(2).script.Name safe?

Unfortunately, any use of getfenv deoptimizes the environment that it’s called on. So getfenv(2) will not affect the performance of the running code, but it may affect the performance of code in the script that’s two levels up the call stack.

We’re probably going to introduce something like debug.getfenv(2, "Script") at some point to help avoid running into this.
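In the meantime, one way to avoid getfenv for this kind of use is to have the caller pass its own name in. A minimal sketch, assuming the script name is the only thing needed from getfenv(2); formatWarning is a hypothetical helper and not part of any Roblox API:

```lua
-- Hypothetical helper: takes the caller's name as an argument instead of
-- looking it up with getfenv(2).script.Name.
local function formatWarning(scriptName, fmt, ...)
	return "[" .. scriptName .. "] " .. string.format(fmt, ...)
end

-- In a Roblox script the first argument would be script.Name;
-- a literal is used here so the sketch runs in plain Lua.
local message = formatWarning("MenuLoader", "loaded %d entries", 3)
print(message)
```

The cost is one extra argument at every call site, but no environment is touched, so nothing gets deoptimized.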


I’d love that, that’d be great.

Or maybe _ENV could be implemented instead. It would be consistent with how newer versions of Lua work.

Of course, it could potentially be read-only to prevent abuse by malicious code.
