The Problem
Hello! First topic! Yay! I’m having trouble getting the number of characters in a string accurately. I have scoured the internet and the DevForum but there’s a zillion different articles and forums for all sorts of apps and programming languages, and I couldn’t find a real topic about it here.
What I’m trying to do is get the length of a string and check if it is a single character long. I’m doing this for a system where a user can display one letter/symbol/emoji/character of their choice on a flag (filtered with Roblox just in case). However, not all symbols are one byte long; many are made of multiple bytes, and the length operator counts bytes!
For example, if I print the length of the letter e:
print(#'e')
It will output 1 as it should, since there is a single letter e. However, if I try printing the length of the zany emoji (🤪):
print(#'🤪')
It will output 4 even though there is clearly one emoji there. I also can’t just check whether the length is less than 5 bytes, because then someone could get away with ‘Ayyo’ on their flag, since each of those letters is 1 byte. The user should only be able to put a single character, not 4 symbols!
To add to the problem, some emojis are 3 bytes in length, and although I couldn’t find an example of one, there are likely also some that are 2 bytes long! I’m kind of stumped on how to solve this one. I’ve figured out many complicated things before, and yet this seems to be such a simple problem without a simple solution.
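To make the varying byte counts concrete, here is a minimal sketch that encodes a few code points and prints how many bytes each one takes. It assumes the standard utf8 library (utf8.char) is available, as it is in Lua 5.3+ and Luau; the samples table and its names are just my own picks for illustration.

```lua
-- Hypothetical sample set: one code point from each UTF-8 length class.
local samples = {
    { name = "letter e",       codepoint = 0x65 },    -- ASCII
    { name = "e with acute",   codepoint = 0xE9 },    -- Latin-1 range
    { name = "smiling face",   codepoint = 0x263A },  -- Basic Multilingual Plane
    { name = "zany face",      codepoint = 0x1F92A }, -- the emoji from above
}

local byteLengths = {}
for index, sample in ipairs(samples) do
    -- utf8.char turns a code point into its UTF-8 byte sequence.
    local encoded = utf8.char(sample.codepoint)
    -- The # operator counts bytes, not characters.
    byteLengths[index] = #encoded
    print(sample.name, #encoded)
end
-- Prints lengths 1, 2, 3 and 4 respectively.
```

So the byte count alone can be anywhere from 1 to 4 for what looks like a single symbol, which is why a plain length check can't tell one emoji from four letters.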
Things That Didn't Work/Give What I Want
Using string.len() instead of the length (#) operator
print(string.len('🤪')) -- 4
Notes: This outputs nothing different from just doing print(#'🤪').
Using string.split() and getting the length of the returned table
print(#string.split('🤪','')) -- 4
Notes: string.split() does not respect how the bytes combine into characters; it simply splits them apart. When it then tries to display the pieces, each one shows up as the unknown character symbol that looks like <?>, since those bytes apparently aren’t meant to make anything by themselves.
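A rough sketch of why the split pieces render as unknown characters: each piece is a single byte that is not valid UTF-8 on its own. I'm using string.sub here instead of string.split so it also runs outside Roblox, and I'm assuming utf8.len returns nil for invalid input, which is the documented behavior in Lua 5.3+ and Luau. The validAlone table is just a name I made up for the example.

```lua
local emoji = "\u{1F92A}" -- the zany face emoji, written as an escape

local validAlone = {}
for i = 1, #emoji do
    -- Take one byte at a time, like splitting on an empty separator does.
    local piece = string.sub(emoji, i, i)
    -- utf8.len returns nil when the string is not valid UTF-8.
    validAlone[i] = utf8.len(piece) ~= nil
    print(i, string.byte(piece), validAlone[i])
end
-- 1   240   false
-- 2   159   false
-- 3   164   false
-- 4   170   false
```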
HttpService’s JSON functions
local Http = game:GetService("HttpService")
print(#Http:JSONDecode(Http:JSONEncode('🤪'))) -- 4
print(#Http:JSONDecode(Http:JSONEncode({'🤪'}))[1]) -- Also 4
Notes: JSON keeps it just as it was, which is probably how it’s meant to work anyway.
string.byte()
local str = '🤪'
print(#{string.byte(str,1,#str)}) -- 4
print(#{string.byte(str,1,9999)}) -- Also 4
Notes: string.byte() returns the individual bytes of a symbol, which really isn’t all that helpful, as that’s not what we want to do; instead, we want to get the individual characters.
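For what it's worth, the byte values do carry some structure: in UTF-8, every byte after the first one of a character falls in the range 0x80 to 0xBF (a "continuation byte"). Below is a minimal sketch that counts only the bytes outside that range. The function name is my own, it assumes the input is valid UTF-8, and it counts code points rather than what a user sees, so emojis built from several code points (flags, skin tones and so on) would still come out larger than 1.

```lua
-- Hypothetical helper: counts bytes that start a character,
-- i.e. bytes that are NOT continuation bytes (0x80..0xBF).
local function countLeadBytes(str)
    local count = 0
    for i = 1, #str do
        local byte = string.byte(str, i)
        if byte < 0x80 or byte > 0xBF then
            count = count + 1
        end
    end
    return count
end

print(countLeadBytes("e"))          -- 1
print(countLeadBytes("\u{1F92A}"))  -- 1 (4 bytes, but only one lead byte)
print(countLeadBytes("Ayyo"))       -- 4
```

This is only an illustration of how the bytes relate to characters, not a tested solution for the flag system.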
Using double quotes
print(#"🤪") -- 4
Notes: There’s obviously no difference lol