Messaging service is down (Retry after -2147483648 seconds.)

That’s strange, it’s returning INT_MIN (the lowest 32-bit signed integer) as the wait time. You’d expect a positive value, since it’s telling you how long to wait before retrying, and one well below the maximum possible.

Does anyone have any sort of solution to this? It looks like we’ve been left in the dark.

Right now I’ve got it set up to cycle through 4 different topic names if one breaks; however, even then it all still breaks eventually after a few hours.
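For anyone wanting to try the same cycling approach, here is a minimal sketch of how it might look. `publishWithFallback` and the injected `publishFn` are hypothetical names, not part of MessagingService; in a game `publishFn` would presumably wrap `MessagingService:PublishAsync`, and subscribers are assumed to be listening on every topic in the list.

```lua
-- Hypothetical helper: try each topic in order, starting at startIndex,
-- until one publish succeeds. publishFn(topic, message) is injected so the
-- sketch can run outside Roblox.
local function publishWithFallback(publishFn, topics, startIndex, message)
	local count = #topics
	for offset = 0, count - 1 do
		-- Wrap around the list so every topic gets one attempt
		local index = (startIndex - 1 + offset) % count + 1
		local ok = pcall(publishFn, topics[index], message)
		if ok then
			return index -- Caller can remember this as the next starting point
		end
	end
	return nil -- Every topic failed
end
```

Returning the index that worked lets the caller skip known-broken topics on the next publish instead of retrying them every time.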


Unfortunately this is the best I’ve come up with as well.

My next approach is to implement an automatically incrementing pair of topics with initial values of 1 and 2:

When one fails, update a datastore value holding the number that gets appended to the topic name: the main topic “mytopic_1” is dropped, the backup “mytopic_2” becomes the main topic, and “mytopic_3” becomes the new backup. So when “mytopic_1” fails, the publishing server increments the datastore counter and automatically publishes to the backup, “mytopic_2”, which in theory should still be working fine and which the subscribing servers should already be listening to. The subscribing servers should soon notice that the counter has been incremented, unsubscribe from “mytopic_1”, and subscribe to the new backup, “mytopic_3”. The best-case scenario is that the primary and backup topics don’t get corrupted at the same time, so there won’t be an interruption of service on the receiving end. This process repeats when the primary topic fails again.
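Roughly, the bookkeeping for that scheme could look like the sketch below. It only covers the naming and the subscribe/unsubscribe decisions; the datastore read/write and the actual MessagingService calls are left out, and `topicsFor` and `subscriptionDiff` are hypothetical helper names.

```lua
local BASE = "mytopic"

-- Counter n means the primary is BASE_n and the backup is BASE_(n+1)
local function topicsFor(counter)
	return string.format("%s_%d", BASE, counter), string.format("%s_%d", BASE, counter + 1)
end

-- What a subscriber should change when it sees the counter move
local function subscriptionDiff(oldCounter, newCounter)
	local oldPrimary, oldBackup = topicsFor(oldCounter)
	local newPrimary, newBackup = topicsFor(newCounter)
	local had = { [oldPrimary] = true, [oldBackup] = true }
	local keep = { [newPrimary] = true, [newBackup] = true }
	local toUnsubscribe, toSubscribe = {}, {}
	for _, name in ipairs({ oldPrimary, oldBackup }) do
		if not keep[name] then
			table.insert(toUnsubscribe, name)
		end
	end
	for _, name in ipairs({ newPrimary, newBackup }) do
		if not had[name] then
			table.insert(toSubscribe, name)
		end
	end
	return toUnsubscribe, toSubscribe
end
```

Going from counter 1 to 2, a subscriber would drop “mytopic_1”, keep “mytopic_2”, and add “mytopic_3”.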

However, I have a feeling this might not work because, in my experience with this issue, when one topic fails the next publish (even to a different topic) will sometimes fail as well.

I’m torn between continuing to band-aid this and just ditching MessagingService altogether until it actually gets fixed.


Yeah, MessagingService is pretty unreliable; I wouldn’t trust it with player data. When cloud scripts and shared memory come out, it will be lit.

As a follow-up, I implemented the fix I laid out above and so far it’s working. There have been cases where the next topic would immediately break, but the topic after that seems to be stable.

I’ve also been keeping track of when it has to increment. Some stats:
[image: topic increment stats]
(Note: Topic 5 is still active, also I didn’t keep track of the seconds)


This bug still occurs ): I am currently using MessagingService in my game to alert other servers that a player has been banned (so that they can’t simply join another server to dodge a ban temporarily while the DataStore is outdated). I’d use DataStore:OnUpdate but it’s been deprecated ):

And just to clarify from other posts I’ve read, no I am not anywhere near the data limit (I’m only sending a UserId) nor am I near the publish request limit (it only sends a publish when someone is banned, which is very rare).

I can look at work-arounds for now, but is there any update on this getting fixed?


You can use the time to fix this temporarily: take os.time(), drop the last two or three digits (i.e. floor-divide by 100 or 1000), and append the result to the topic name you pass to PublishAsync and SubscribeAsync. It’s a hacky solution, but it has to work.
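One reading of this suggestion is that every server derives the topic name from the clock, so the name changes by itself every so often. A tiny sketch under that assumption (`timedTopic` and the `"bans"` base name are made up for illustration):

```lua
-- Hypothetical helper: same clock bucket gives the same topic name on every server
local function timedTopic(base, now, bucketSeconds)
	return string.format("%s_%d", base, math.floor(now / bucketSeconds))
end

-- In a game, `now` would come from os.time()
local name = timedTopic("bans", 1600000123, 1000)
```

Servers can briefly disagree around a bucket boundary, so a message sent right at the switch may be missed with this plain version.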

Thanks for the report! We’ve filed this internally and we’ll follow up here when we have an update for you.


Hey folks, if you happen to know the rate at which you call PublishAsync for a given topic, e.g. 20/s, could you please reply to this post?

For one topic it’s once every 5-ish minutes and for another it’s once every 3 minutes.

For my server reporter it publishes once every 15 seconds per server.

@Garnold @Stratiz thanks for that info. I’ll post any additional updates to this thread. Thanks!

A better (very reliable) solution (without datastores) is to use a set time interval (e.g. 1 minute):
(This is a fancier version of what @takticiadam mentioned)

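local MessagingService = game:GetService("MessagingService")
local topicName = "MatchMaking" -- Example

local SECOND = 1
local MINUTE = 60 * SECOND
local PERIOD = MINUTE
-- Seconds to subscribe to a topic before it becomes the outgoing topic,
-- and how long after that before unsubscribing from the now unused one
local EARLY_UPDATE = PERIOD * 0.1

local function callback(message)
	-- Something (the published payload is in message.Data)
end

local topic -- Topic currently used for publishing; nil until the first interval is in effect
local subscription

-- Run the rotation in its own thread so the rest of the script keeps going
task.spawn(function()
	while true do
		local interval = math.floor((os.time() + EARLY_UPDATE) / PERIOD)
		local curTopic = topicName .. "_" .. interval
		local oldSubscription = subscription

		-- SubscribeAsync can throw, so retry rather than kill the loop
		local ok, result = pcall(function()
			return MessagingService:SubscribeAsync(curTopic, callback)
		end)
		if ok then
			subscription = result

			-- Wait for the interval to start (no wait if we're already inside it)
			task.wait(math.max(0, interval * PERIOD - os.time()))
			topic = curTopic -- New topic now in effect
			task.wait(EARLY_UPDATE)
			-- Unsubscribe from the old topic
			if oldSubscription then
				oldSubscription:Disconnect()
			end
			-- Sleep until EARLY_UPDATE seconds before the next interval, so every
			-- server switches at the same clock time regardless of when it started
			task.wait(math.max(0, (interval + 1) * PERIOD - EARLY_UPDATE - os.time()))
		else
			warn("Subscribe failed: " .. tostring(result))
			task.wait(1)
		end
	end
end)

-- Example of publishing messages:
local function publish(message)
	if topic then
		MessagingService:PublishAsync(topic, message)
	end
end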

Effectively this would “phase” topics over a set time period, resubscribing just before that period passes, and unsubscribing from the old one just after. That way no messages are potentially lost in the transition due to a bit of network delay, and the topic is constantly fresh in case limits are improperly tracked or something.
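To make the phasing concrete, here is a small pure-Lua sketch of which interval a server would publish to and which intervals it would be subscribed to at a given time. It assumes the switch is aligned to clock boundaries (multiples of PERIOD) and reuses the 60 second period and 10% overlap from the example; `phaseAt` is a hypothetical name.

```lua
local PERIOD = 60
local EARLY_UPDATE = PERIOD * 0.1

-- Returns the interval used for publishing and the list of subscribed intervals
local function phaseAt(now)
	local current = math.floor(now / PERIOD)
	local offset = now % PERIOD
	local subscribed = { current }
	if offset < EARLY_UPDATE then
		-- Just after a switch: still listening to the previous topic
		table.insert(subscribed, 1, current - 1)
	elseif offset >= PERIOD - EARLY_UPDATE then
		-- Just before a switch: already listening to the next topic
		table.insert(subscribed, current + 1)
	end
	return current, subscribed
end
```

In the middle of a period there is a single subscription; near a boundary there are two, which is what absorbs small clock or network differences between servers.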


So I have a small update for y’all - we’ve pushed a small fix to the backend that should help with this issue a bit. It’s not a complete fix, and we’re investigating further, but this particular event should happen much less frequently than it does now.
There are some requirements for reproducing this bug, i.e. you need a good number of players and multiple servers for your game. Once you hit a certain threshold, the topic is locked out until you start a new one, so technically some of the workarounds listed in this thread, specifically changing the subscription topic, would work to get around this threshold. Ideally you shouldn’t have to use that workaround, and that’s what we’re aiming for, but if you already have it in your game there’s no need to remove it, as it’s fairly harmless in the big picture.
For some context, the “Retry after -2147483648 seconds.” message is returned whenever the system believes you’ll never have the allotment to send a message using the system. Typically it should give you an actual time, but the bug was preventing it from realizing that.
I appreciate all of your feedback and input on this matter, and I will continue looking into a long-term solve for y’all. Please post any updates as well.

Thanks.
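If you want a workaround to react only to this specific lockout, one option is to inspect the error text caught by pcall. The sketch below assumes the message always has the shape of the single one quoted in this thread ("Retry after N seconds"), which may not hold; `parseRetryAfter` and `isLockedOut` are hypothetical helpers.

```lua
-- Pull the retry delay out of an error message, or return nil if absent
local function parseRetryAfter(errorMessage)
	local seconds = tostring(errorMessage):match("Retry after (%-?%d+) seconds")
	if not seconds then
		return nil
	end
	return tonumber(seconds)
end

-- A negative delay is treated as "this topic will not recover, switch topics"
local function isLockedOut(errorMessage)
	local seconds = parseRetryAfter(errorMessage)
	return seconds ~= nil and seconds < 0
end
```

A positive delay can be handled by simply waiting and retrying on the same topic, while a negative one is the cue to rotate.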


This issue should now be resolved! If this issue is still occurring, please create a new topic for us to look into.

