SubscribeAsync gives seemingly random values as the message for the callback

This has been happening for quite some time now. We started tracking it in our game, Adventure Up!, about two weeks ago. Since then, we have been getting strange values in the SubscribeAsync callback, mostly nil or Instance values:

During the last 24 hours we even had a couple of cases where the value was a number or userdata:

Here’s the exact code we are using:

messagingService:SubscribeAsync(topic, function(message)
	if type(message) ~= "table" then
		warn("Messaging | SubscribeAsync message was not a table, was " .. typeof(message))
		return
	end
	-- etc
end)
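For anyone trying to narrow this down, it may help to log a little more than the type. The following is only a sketch: `describeBadMessage` and `badMessageCounts` are hypothetical names, not part of the game's code. It tallies unexpected values per type and, for Instances, records the class and full name, which could then be passed to `warn` inside the callback.

```lua
-- typeof is Roblox-specific; fall back to type so the sketch also runs in plain Lua.
local typeOf = typeof or type

-- Hypothetical per-server tally of unexpected message types.
local badMessageCounts = {}

-- Builds a log line for a value that was expected to be a table but was not.
local function describeBadMessage(message)
	local kind = typeOf(message)
	badMessageCounts[kind] = (badMessageCounts[kind] or 0) + 1

	local detail = tostring(message)
	if kind == "Instance" then
		-- ClassName and GetFullName exist on every Roblox Instance.
		detail = message.ClassName .. " " .. message:GetFullName()
	end

	return string.format(
		"%s (%s), seen %d times on this server",
		kind, detail, badMessageCounts[kind]
	)
end
```

Knowing which Instances show up, and whether the same ones repeat, might give whoever investigates a better starting point than the type alone.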

We don’t know how much this is affecting actual users, since it doesn’t happen often enough for us to tell. It is worth noting that we have also had lots of these errors logged:

EDIT: This is definitely happening more today than before.
The number of nil messages is rising quite a bit (it will probably reach 400 overnight), which is worrying.

Can you post related code that is sending messages?

This is the function that every single PublishAsync request is sent through:

local messageMaxTries = 3

-- Publishes data to topic, retrying after a 5 second delay if PublishAsync throws.
local function messageSend(topic, data, tries)
	-- Validate inputs before anything is sent
	assert(type(topic) == "string", "Argument #1 must be a string")
	assert(#topic > 1, "Argument #1 must be a string more than 1 characters long")
	assert(#topic < 64, "Argument #1 must be a string less than 64 characters long")
	assert(type(data) == "table", "Argument #2 must be a table")
	tries = tries or 1
	local success, result = pcall(function()
		messagingService:PublishAsync(topic, data)
	end)
	if not success then
		-- tries starts at 1, so this allows up to messageMaxTries retries after the first attempt
		if tries <= messageMaxTries then
			wait(5)
			return messageSend(topic, data, tries + 1)
		end
	end
	-- On failure, result holds the error message from pcall
	return success, result
end

Every request sent is piped through this function, and as you can see, all the inputs are validated beforehand (with some extra limits on topic).
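A side note on the retry logic above: because `tries` starts at 1 and the check is `tries <= messageMaxTries`, a publish that keeps failing is attempted four times in total. If exactly three attempts are intended, a loop makes the count explicit. This is a minimal sketch, not the game's code: `sendWithRetry`, `publish`, and `delay` are hypothetical names, with `publish` standing in for a function that calls `messagingService:PublishAsync(topic, data)` and `delay` for `wait`.

```lua
local MAX_ATTEMPTS = 3

-- publish(topic, data): hypothetical callable that performs the actual send.
-- delay(seconds): hypothetical callable that yields between attempts.
-- Returns true and the attempt number on success, or false and the last error.
local function sendWithRetry(publish, delay, topic, data)
	assert(type(topic) == "string", "topic must be a string")
	assert(type(data) == "table", "data must be a table")

	local lastError
	for attempt = 1, MAX_ATTEMPTS do
		local ok, err = pcall(publish, topic, data)
		if ok then
			return true, attempt
		end
		lastError = err
		-- Only wait if another attempt will follow.
		if attempt < MAX_ATTEMPTS then
			delay(5)
		end
	end
	return false, lastError
end
```

Passing the publish and delay functions in as arguments also makes the retry behaviour easy to exercise outside a live server.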

It’s… getting weirder. This is from today:

The total number of messages that are not the proper type is now also over 1000.

Looking into it! Do you know of other games that this is happening to, or is it just Adventure Up?

I’m not aware of any other games that have this issue, but I haven’t talked to anyone either.
It’s definitely happening more now than before, especially for nil messages, with about 200 of these errors logged per day vs ~10 at the start of the month.
Here are our total error counts for this issue in Adventure Up right now:

I’m a bit curious what kind of data you’re trying to publish. I’m wondering whether it’s possible that a published table is being serialized weirdly by the Lua interpreter. I’ll continue to poke around to see if I can confirm this behavior. In the meantime, can you confirm that this is still happening?

It is definitely still happening, and at the same rate as before. It’s staying pretty consistent now.

Here are the two types of data we’re sending through MessagingService, with the first type of table making up about 90% or more of all the data sent:

{
	Name    = string,
	UserId  = number,
	PlaceId = number,
	JobId   = string,
}

{
	SenderId      = number,
	SenderName    = string,
	RecipientId   = number,
	RecipientName = string,
	MessageId     = string,
	Message       = string,
}

These are used for live in-game friend status updates as well as cross-server chat messages respectively. Hope that helps!
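Given these two shapes, a receiver could check fields as well as the top-level type. The sketch below is an illustration only: `STATUS_FIELDS`, `CHAT_FIELDS`, `matchesShape`, and `classifyMessage` are hypothetical names. In the documented API the callback receives a table with `Data` and `Sent` fields, so a check like this would presumably be applied to `message.Data` once `message` itself has been confirmed to be a table.

```lua
-- Expected field types for the two payloads described above.
local STATUS_FIELDS = {
	Name = "string",
	UserId = "number",
	PlaceId = "number",
	JobId = "string",
}

local CHAT_FIELDS = {
	SenderId = "number",
	SenderName = "string",
	RecipientId = "number",
	RecipientName = "string",
	MessageId = "string",
	Message = "string",
}

-- True if value is a table whose listed fields all have the expected type.
local function matchesShape(value, fields)
	if type(value) ~= "table" then
		return false
	end
	for key, expectedType in pairs(fields) do
		if type(value[key]) ~= expectedType then
			return false
		end
	end
	return true
end

-- Returns "status", "chat", or nil if the payload matches neither shape.
local function classifyMessage(data)
	if matchesShape(data, STATUS_FIELDS) then
		return "status"
	elseif matchesShape(data, CHAT_FIELDS) then
		return "chat"
	end
	return nil
end
```

A payload that comes back as nil here can be logged and dropped without ever being indexed.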

Just posting here to have a public record of this - I’ve communicated with filiptibell and we have not been able to reproduce this issue internally. We will confirm that this is indeed unintended behavior, but given the narrow scope of this and relative infrequency, we will be shelving this problem for now.

That said, if filiptibell or any developer can provide additional information (such as exactly what input causes this issue and whether it does so deterministically) then we are happy to reopen and re-investigate this issue.

Hello! I can confirm we are definitely seeing the same thing. Arbitrary objects seem to be passed into the callback for a subscription. This does not happen every day, but when it does, it happens in spikes.

The error occurs when we try to index the data table: what is passed in is not a table but some arbitrary Instance.

I see. Does this correlate with spikes in players or messages sent? I’m wondering if it has anything to do with the “Too many subscriptions” issue from your other thread.

This, I am not certain of. Let me analyze my data for the times when those errors occurred to see if I can find any patterns. I’ll get back to you.

Hey there,

Do you mean a spike in players in the server or globally? It definitely doesn’t look like it correlates with a spike in players globally, as the most recent spike of these errors occurred at a low-traffic time. However, it doesn’t happen across all servers; when it does happen, it is on only one server.

Here is a screenshot of the 50 errors that occurred at exactly 2:02:49 AM UTC on March 14th.

Before that, the last set of errors was on March 9th, 5 days earlier.
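One way to check whether every occurrence follows this pattern (many errors in the same second on a single server) is to bucket the logged errors by server and second. This is a rough sketch that assumes each log record carries a server identifier and a Unix timestamp. The record layout and the `findBursts` name are hypothetical.

```lua
-- records: array of { jobId = string, timestamp = number } (assumed layout).
-- Returns a map from "jobId@second" to the number of errors in that bucket,
-- keeping only buckets with at least minCount errors.
local function findBursts(records, minCount)
	local buckets = {}
	for _, record in ipairs(records) do
		local second = math.floor(record.timestamp)
		local key = record.jobId .. "@" .. tostring(second)
		buckets[key] = (buckets[key] or 0) + 1
	end

	local bursts = {}
	for key, count in pairs(buckets) do
		if count >= minCount then
			bursts[key] = count
		end
	end
	return bursts
end
```

If every burst maps to a single server, that would support the idea that one server gets into a bad state rather than individual messages being corrupted in transit.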

So, these errors are pretty rare, but nonetheless strange. The issue where SubscribeAsync calls don’t get disconnected seems to be far more widespread and to have a much more severe impact, such as leaking memory.

I can chart the number of errors where the SubscribeAsync calls seem not to get disconnected (triggering the error in my other thread). I will update the other thread shortly.

Now, this is a shot in the dark: are there any remotes that could give the client direct access to a function, or a remote that fires MessagingService without going through the function you use? Just to be safe, I would recommend a quick scan of the game with CTRL + SHIFT + F for MessagingService. Who knows, it could be an issue with MessagingService or the Lua interpreter, or it could be caused by something else entirely. I’ve had my fair share of issues where I thought it was a Roblox problem when it was really just my problem.

This is a very strange bug and I’ve honestly never seen anything like it. Perhaps the reference to the table being passed to the callback somehow ends up pointing at some other Lua value in memory? That might make sense considering some of the information above, notably the logs containing many attachments. Based on the logs above, I am wondering if this could also be timing based.

When a server encounters this error, what happens if you store, say, 10000 userdatas created with newproxy in an array plus a hashmap (indexes to values and values to indexes), then check in the callback whether the message is in the hashmap? Are any userdatas leaked? If any appear, do they appear in a certain pattern? What kinds of pointer values do you see with tostring when they appear? Are they always increasing in value?
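The experiment described above could be set up roughly as follows. This is a sketch under assumptions: `canaries`, `canaryIndex`, and `checkForLeakedCanary` are hypothetical names, and the table fallback exists only so the snippet also runs in Lua versions without `newproxy` (Luau and Lua 5.1 provide it).

```lua
-- Creates one unique marker value. newproxy(false) returns a blank userdata.
local function newCanary()
	if newproxy then
		return newproxy(false)
	end
	return {} -- fallback for runtimes without newproxy
end

local CANARY_COUNT = 10000
local canaries = {}    -- index -> canary
local canaryIndex = {} -- canary -> index

for i = 1, CANARY_COUNT do
	local canary = newCanary()
	canaries[i] = canary
	canaryIndex[canary] = i
end

-- Call this from the SubscribeAsync callback with whatever value arrived.
-- Returns the canary's index if the value is one of ours, otherwise nil.
local function checkForLeakedCanary(message)
	local index = canaryIndex[message]
	if index then
		-- tostring on a userdata includes its address, which helps spot patterns.
		print(string.format("Leaked canary #%d (%s)", index, tostring(message)))
	end
	return index
end
```

If canaries do turn up in the callback, the logged indexes and addresses would show whether the leaked values are adjacent or increasing over time.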

If nothing happens, what if you do the same thing but also connect to some frequently firing events (for example, Heartbeat and Stepped) and loop over the table with some small random yields? Are values leaked more or less frequently, if at all?

Something that makes this just that little bit stranger is that this starts happening on one isolated server, seemingly at random.
