Port unicode and hex string escape formats from Lua 5.2/5.3

As a Roblox developer, it is currently too hard to insert Unicode characters into strings dynamically. This is because, although Roblox has ported the utf8 library from Lua 5.3, the accompanying string escape format (\u{XXX}, where each X is a hex digit of the codepoint) has not been ported along with it. This makes inserting Unicode characters reliant upon copy-and-paste or concatenation, neither of which is ideal for large-scale operations. It also leaves the port of the utf8 functionality incomplete, which may confuse new users.

This should also be accompanied by a port of the hex escape format (\xXX, where XX is exactly two hex digits giving a byte value) so as to allow consistency between Lua documentation and the behavior within Roblox.

If Roblox were to port these formats, it would make third-party resources less confusing, complete the porting of Lua’s utf8 functionality, and make inserting Unicode characters faster. On a more meta note, it would bring Roblox’s variant of Lua more in line with the standard for Unicode and ASCII string escapes, making it more accessible to those who have experience in other languages or other versions of Lua (LuaJIT, Lua 5.2+, etc.).

As it stands, the only ways to insert characters that encode to multiple bytes are to copy and paste from an external site; to memorize keyboard combos; to concatenate the result of a utf8.char call, which can be expensive; or to use the currently supported decimal escape format to insert the individual bytes of the character, which is not ideal (€ is \226\130\172 if we go by this method). If the Unicode escape format were ported, that would become \u{20ac}, which looks much more familiar and shows the raw codepoint of the character.
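
To make the comparison concrete, here is a rough sketch of the euro sign written three ways. It assumes a stock Lua 5.3 interpreter, where the \u{XXX} escape already exists; the last string literal is the syntax being requested and would not be accepted by Roblox as things stand. The variable names are only for illustration.

```lua
-- Sketch only: assumes stock Lua 5.3, not current Roblox Lua.
local viaDecimal = "\226\130\172"     -- decimal escapes, one per UTF-8 byte
local viaChar    = utf8.char(0x20AC)  -- runtime call that takes the codepoint
local viaUnicode = "\u{20AC}"         -- requested escape, also takes the codepoint

-- All three spellings produce the same three-byte string.
assert(viaDecimal == viaChar)
assert(viaChar == viaUnicode)

print(#viaUnicode)                 --> 3
print(utf8.codepoint(viaUnicode))  --> 8364
```

The escape version needs no runtime call and no lookup of the UTF-8 byte values, which is the main convenience being asked for.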

It’s also worth porting the hex escape format for the same reason (\169 vs \xA9 for the byte 0xA9, which is the codepoint of the copyright symbol), as it’s featured in a myriad of documentation online, including the developer hub.
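
As a small sketch of how the two byte escapes line up, again assuming a stock Lua 5.3 interpreter rather than current Roblox behavior: a hex escape and a decimal escape with the same value produce the same single byte. One thing to keep in mind is that a lone byte 0xA9 is the copyright symbol's codepoint, not its UTF-8 encoding, which takes two bytes. The variable names here are hypothetical.

```lua
-- Sketch only: assumes stock Lua 5.3, where \xXX is already supported.
local decimalByte = "\169"
local hexByte     = "\xA9"

-- Both escapes describe the same single byte.
assert(decimalByte == hexByte)
assert(string.byte(hexByte) == 0xA9)

-- The UTF-8 encoding of the copyright symbol (U+00A9) is two bytes.
local copyrightUtf8 = "\xC2\xA9"
assert(copyrightUtf8 == "\194\169")
assert(copyrightUtf8 == utf8.char(0xA9))
```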
