Give the solution to @Nitefal, but I just want to add some extra considerations for what you’re looping with (my basis for these is this post):
Let’s take the string <i>some</i> some, which would show up as “some some” with the first word in italics. string.len() also counts rich-text tags (like <i> and </i>) as separate characters, which is why it returns 16 characters instead of the 9 you actually see. The post provides an example of a function that can be used to filter out tags:
local function removeTags(str)
    -- replace line break tags (otherwise grapheme loop will miss those linebreak characters)
    str = str:gsub("<br%s*/>", "\n")
    return (str:gsub("<[^<>]->", ""))
end
warn(string.len("<i>some</i> some")) -- 16,, includes tags :(
warn(string.len(removeTags("<i>some</i> some"))) -- 9,, yay!!
Now for emojis, let’s take the string “hi 🔥”. string.len() counts bytes rather than characters, so it counts this emoji as 4 characters (🔥 takes 4 bytes in UTF-8). The solution is to use utf8.len() instead, which counts code points and so correctly treats a single emoji like 🔥 as a single character.
warn(string.len("hi 🔥")) -- 7,, because an emoji equals 4...
warn(utf8.len("hi 🔥")) -- 4,, yay!!
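If your text can contain both tags and emojis, the two fixes can be combined. Here is a rough sketch of that, not something from the linked post: visibleLength is a name I made up, and the fallback to the byte count on invalid UTF-8 is my own assumption about what you would want. I used print so it also runs outside Roblox; warn works the same way in Studio.

```lua
-- Strips rich-text tags, using the same patterns as removeTags above
local function removeTags(str)
    -- turn line break tags into real newlines so they still count as one character
    str = str:gsub("<br%s*/>", "\n")
    return (str:gsub("<[^<>]->", ""))
end

-- Hypothetical helper: number of code points left once the tags are stripped
local function visibleLength(str)
    local stripped = removeTags(str)
    -- utf8.len returns nil (plus a byte position) on invalid UTF-8,
    -- so fall back to the plain byte count in that case (assumption)
    return utf8.len(stripped) or #stripped
end

print(visibleLength("<i>hi</i> 🔥")) -- 4
print(visibleLength("<i>some</i> some")) -- 9
```

Keep in mind that utf8.len() counts code points, so emojis built from several code points (flags, skin tone variants, and so on) will still count as more than one.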