Before anybody panics: Studio reads these files just fine, despite them being wrong XML. I will make a second bug report for reading if requested.
Roblox’s XML format escapes UTF-8 text in string properties under all circumstances. As an example, it would save a StringValue with its Value set to `☺` as follows:
<roblox xmlns:xmime="http://www.w3.org/2005/05/xmlmime" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://www.roblox.com/roblox.xsd" version="4">
<Meta name="ExplicitAutoJoints">true</Meta>
<External>null</External>
<External>nil</External>
<Item class="StringValue" referent="RBXF3D08260A7644806A9C1DF20B3C26425">
<Properties>
<BinaryString name="AttributesSerialize"></BinaryString>
<SecurityCapabilities name="Capabilities">0</SecurityCapabilities>
<bool name="DefinesCapabilities">false</bool>
<string name="Name">Value</string>
<int64 name="SourceAssetId">-1</int64>
<BinaryString name="Tags"></BinaryString>
<string name="Value">&#226;&#152;&#186;</string>
</Properties>
</Item>
</roblox>
Note that the `Value` property becomes `&#226;&#152;&#186;`, which is seemingly an escape of the three bytes that make up `☺` in UTF-8 (`e2 98 ba`). This would be fine, except that is not how it works.
The sequence `&#226;&#152;&#186;` is a sequence of what are called character references, and as per the XML standard they expand by code point, not by byte. This causes the above sequence to expand to `\u{e2}\u{98}\u{ba}` (the three characters U+00E2, U+0098, and U+00BA), which in turn encodes to `c3 a2 c2 98 c2 ba` in UTF-8.
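To reproduce the expansion outside of Studio, here is a minimal sketch using Python's standard `xml.etree.ElementTree` parser. It assumes the saved escape uses decimal character references, one per UTF-8 byte; the `snippet` string is a trimmed-down stand-in for the file above, not the full document.

```python
import xml.etree.ElementTree as ET

# Assumption: the saved escape is three decimal character references,
# one for each UTF-8 byte (e2 98 ba) of U+263A.
snippet = '<string name="Value">&#226;&#152;&#186;</string>'

# A conforming parser expands each reference to the code point it names.
parsed = ET.fromstring(snippet).text

# Three separate characters come back, not one smiley.
code_points = [f"{ord(ch):x}" for ch in parsed]

# Re-encoding those three characters as UTF-8 gives six bytes.
utf8_hex = parsed.encode("utf-8").hex(" ")

print(code_points)  # ['e2', '98', 'ba']
print(utf8_hex)     # c3 a2 c2 98 c2 ba
```

The parsed value has length 3 and does not compare equal to `☺`, which is the mismatch described above.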
Expected behavior
Every XML parser I have tried expands this the way the standard says it should. My expectation would be that Roblox either escapes Unicode characters properly (in the above example, you would escape `☺` as `&#9786;`) or simply does not escape them at all, since XML supports UTF-8 encoding.
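Both options can be checked with the same parser. The sketch below is illustrative only: `escape_by_code_point` is a hypothetical helper written for this example, not anything Roblox ships, and it deliberately ignores the usual `&`, `<`, and `>` escaping.

```python
import xml.etree.ElementTree as ET

def escape_by_code_point(text: str) -> str:
    """Hypothetical helper: emit one decimal character reference per
    non-ASCII code point (does not handle &, < or >)."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)

# Option 1: escape by code point. U+263A is 9786 in decimal.
escaped = escape_by_code_point("\u263a")
round_trip = ET.fromstring(f"<string>{escaped}</string>").text

# Option 2: do not escape at all and store the character directly.
raw = ET.fromstring("<string>\u263a</string>").text

print(escaped)                 # &#9786;
print(round_trip == "\u263a")  # True
print(raw == "\u263a")         # True
```

Either form survives a round trip through a standards-compliant parser as the single character U+263A.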