.rbxmx format can't read multiple CDATAs

Usually, when you save a script as a .rbxmx, it’ll look like this: (omitted a lot of unrelated data)

<roblox>
	<Item class="LocalScript">
		<Properties>
			<ProtectedString name="Source"><![CDATA[
			print"a"
			]]></ProtectedString>
		</Properties>
	</Item>
</roblox>

If you have ]]> in your code, it’ll use the old notation, where some chars get escaped:

<ProtectedString name="Source">--]]&gt;</ProtectedString>
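For anyone generating the old notation themselves, here is a rough sketch in Python. It assumes the escaped form is ordinary XML character escaping of `&`, `<` and `>`. Studio's exact escaping rules aren't documented, so that is a guess, and `to_escaped` is just a name made up for this example.

```python
from xml.sax.saxutils import escape

def to_escaped(source):
    # Hypothetical helper: standard XML escaping of &, < and >.
    # Whether Studio escapes exactly this set of characters is an assumption.
    return escape(source)

escaped = to_escaped("--]]>")
print('<ProtectedString name="Source">' + escaped + '</ProtectedString>')
```
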

In standard XML, multiple CDATA sections in a row are allowed:
source: https://en.wikipedia.org/wiki/CDATA#CDATA_sections_in_XML

<roblox>
	<Item class="LocalScript">
		<Properties>
			<ProtectedString name="Source"><![CDATA[
			print"a"
			]]><![CDATA[
			print"b"
			]]></ProtectedString>
		</Properties>
	</Item>
</roblox>
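For comparison, here is a small sketch showing how a standard XML parser treats two CDATA sections in a row. It uses Python's `xml.etree.ElementTree`, which has nothing to do with Studio's own parser. The `RBXMX` string and the `read_source` helper are made up for this example, with the indentation of the test file stripped for brevity.

```python
import xml.etree.ElementTree as ET

# Same structure as the test file, with two adjacent CDATA sections.
RBXMX = (
    '<roblox>'
    '<Item class="LocalScript">'
    '<Properties>'
    '<ProtectedString name="Source">'
    '<![CDATA[print"a"\n]]><![CDATA[print"b"\n]]>'
    '</ProtectedString>'
    '</Properties>'
    '</Item>'
    '</roblox>'
)

def read_source(xml_text):
    """Return the text of the first ProtectedString named Source."""
    root = ET.fromstring(xml_text)
    node = root.find(".//ProtectedString[@name='Source']")
    return node.text

# The parser joins both CDATA sections into a single text value.
source = read_source(RBXMX)
print(source)
```
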

But when you try to insert that .rbxmx, it’ll print a red error line in the output:
(you can actually use this exact test code, just create a .rbxmx with it and try to insert it in studio)

00:19:24.890 - Unable to open model "C:/test.rbxmx". The file is corrupted.

This is the actual bug ^ (multiple CDATA sections aren’t allowed)

Having multiple CDATA sections in a row is necessary to be able to “escape” ]]>:

print("]]>")

becomes

<ProtectedString name="Source"><![CDATA[
print("]]]]><![CDATA[>")
]]></ProtectedString>

This splits the CDATA section in the middle of ]]>, so it no longer counts as the end marker.
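The split can be sketched in a few lines of Python. This is only an illustration of the standard XML trick, not what Studio does, and per the bug above Studio currently refuses to read the result. `to_cdata` and the two constants are names invented for this example.

```python
import xml.etree.ElementTree as ET

CDATA_OPEN = "<![CDATA["
CDATA_CLOSE = "]]>"

def to_cdata(source):
    # Hypothetical helper: end the section after "]]" and reopen a new one
    # before ">", so the text never contains a literal "]]>" terminator.
    safe = source.replace(CDATA_CLOSE, "]]" + CDATA_CLOSE + CDATA_OPEN + ">")
    return CDATA_OPEN + safe + CDATA_CLOSE

wrapped = to_cdata('print("]]>")')
print(wrapped)

# A standard XML parser joins the sections back into the original source.
element = ET.fromstring(
    '<ProtectedString name="Source">' + wrapped + '</ProtectedString>'
)
print(element.text)
```

Source without any ]]> comes out as a single CDATA section, so the extra sections only appear in the rare case where they are needed.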

TL;DR: Roblox’s XML format (.rbxmx) doesn’t support two (or more) CDATA sections in a row

Why use CDATA over the old notation?

Eh, it just feels weird that this doesn’t work. I’ve also been playing around with storing .rbxmx files in git repositories. I made a packager (with a git pre-commit hook) that “builds” a .rbxmx containing the scripts in the repository (regular filesystem structure with files named Server.server.lua etc). I would’ve preferred to use CDATA: it’s much easier to git diff, since nothing is escaped (except ]]>, which is rare in Roblox/Lua), and it’s just cleaner.

It’s fine that Roblox uses the old format when saving a file that contains ]]>. It’d just be nice if Roblox could at least read files that use multiple CDATA sections in a row, like how it’s supposed to work.

that’s about it


Having written an implementation of the rbx(lm)x format, I can say with confidence that the format is not XML. At most, it only looks like XML.

Having extensively probed the studio’s parser for this format, I can tell you that it is completely nonsensical. Its source is probably a Big Scary Thing that no engineer wants to touch. The only attempts to improve it will likely involve completely rewriting it using a standard XML parser.

There is no formal specification of the format, and the Roblox studio/client contains the only official implementation that reads and writes it. Therefore, the studio/client indirectly specifies the format. A consequence of this is that there are no bugs with the format as long as the studio/client can read any file that any current or previous version of the studio/client has ever written in the format.

Third-party implementations that write files which cannot be read by the current studio/client have technically implemented the format incorrectly. This is how it must be until Roblox decides that they’re willing to support a proper specification.
