Utf8.upper and utf8.lower

As a Roblox developer, it is currently too hard to do efficient and reliable case conversion for letters that are encoded in more than one byte.

This is a very simple feature request. As is commonly known, some of the string library’s functions are only guaranteed to work with 1-byte characters; the relevant ones for this request are string.upper and string.lower. Without writing a custom upper/lower implementation, there is no way to catch every lowercase or uppercase letter and switch it to the opposite case.

The characters most commonly affected by the string library’s limitations are largely non-Latin. The major non-Latin scripts with upper and lowercase forms that I can think of that have applications in Roblox are the Cyrillic alphabet (Russian, Ukrainian) and the Greek alphabet. That being said, there are still Latin characters that take up more than 1 byte (the Latin Extended Unicode blocks).
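To illustrate what a custom implementation currently involves, here is a minimal sketch that uppercases only ASCII, basic Cyrillic, and unaccented Greek letters. The names `upperCodepoint` and `utf8Upper` are hypothetical and not part of any Roblox API, and the ranges are hand-picked assumptions: Latin Extended, accented Greek, and one-to-many mappings such as ß are not handled.

```lua
-- Hypothetical helper: maps one code point to its uppercase form.
-- Covers only a few hand-picked ranges; everything else is returned unchanged.
local function upperCodepoint(cp)
	if cp >= 0x61 and cp <= 0x7A then -- ASCII a-z
		return cp - 0x20
	elseif cp >= 0x430 and cp <= 0x44F then -- Cyrillic а-я
		return cp - 0x20
	elseif cp >= 0x450 and cp <= 0x45F then -- Cyrillic ѐ-џ
		return cp - 0x50
	elseif cp == 0x3C2 then -- Greek final sigma ς maps to Σ
		return 0x3A3
	elseif cp >= 0x3B1 and cp <= 0x3C9 then -- Greek α-ω
		return cp - 0x20
	end
	return cp
end

-- Hypothetical stand-in for the requested utf8.upper.
-- utf8.codes raises an error if the input is not valid UTF-8.
local function utf8Upper(s)
	local out = {}
	for _, cp in utf8.codes(s) do
		out[#out + 1] = utf8.char(upperCodepoint(cp))
	end
	return table.concat(out)
end

print(utf8Upper("привет")) --> ПРИВЕТ
print(utf8Upper("αβγς")) --> ΑΒΓΣ
```

Even this partial version needs a range check per script, which is the maintenance burden a built-in function would remove.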

Details aside, the utf8 library’s functions are designed specifically to work with every valid UTF-8 encoded character. Many utf8 library functions also correspond to string library functions (utf8.graphemes = string.gmatch, utf8.char = string.char, utf8.codes = string.byte, etc.), so these functions would fit right in with the rest of the library.

If Roblox is able to address this issue, it would improve my development experience because I would have an option that allows me to easily convert every Unicode character from upper to lowercase and vice versa.

Some use cases that these functions would solve

1- Wanting to add tone to a string. If you want to represent someone screaming, you’d typically do so with uppercase letters. A workaround could be the rich text uppercase tag (<uc>), but rich text tags only change how the text is rendered; they do not convert the string itself. We also don’t have a lowercase tag if someone did want to convert a string from upper to lowercase.

2- Wanting to manipulate a string. Some time ago I came across someone in #help-and-feedback:scripting-support who wanted to replace a character preceded by a caret (^) with its uppercase form. They noticed the string library did not catch most characters with diacritics.
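The caret case can be worked around today with something like the sketch below. It assumes a hand-maintained lookup table (`UPPER`, hypothetical) for the multi-byte letters you care about and falls back to string.upper for everything else; `applyCarets` is likewise an illustrative name, not an existing function.

```lua
-- Hypothetical lookup table; extend it with whatever characters your game needs.
local UPPER = {
	["é"] = "É",
	["ñ"] = "Ñ",
	["ü"] = "Ü",
}

-- Replaces "^x" with the uppercase form of x, where x is one UTF-8 character.
local function applyCarets(text)
	-- "%^" matches a literal caret; utf8.charpattern matches exactly one
	-- UTF-8 character, assuming the input is valid UTF-8.
	local pattern = "%^(" .. utf8.charpattern .. ")"
	local result = text:gsub(pattern, function(char)
		-- Unmapped multi-byte characters pass through string.upper unchanged.
		return UPPER[char] or string.upper(char)
	end)
	return result
end

print(applyCarets("^hello, caf^é")) --> Hello, cafÉ
```

The table has to list every non-ASCII letter by hand, which is exactly what a built-in utf8.upper would make unnecessary.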

I’m sure there are more use cases but these are the ones I’ve encountered firsthand thus far.


Agreed. string.upper() and string.lower() only support Latin letters. I sometimes want to be able to handle lowercase letters in other scripts. For example, in Ohio Hide and Seek, I want to make sure that I can cause destruction to the monsters when I just say “ы”, regardless of the case. Your first use case was somehow placed by Curse Randomizer (I only found it from crainer’s video and haven’t actually played the game, so yep).

Duplicate of a thread I made with the same title: Utf8.upper and utf8.lower

