Can't split a string containing words from other languages

You can write your topic however you want, but you need to answer these questions:

  1. What do you want to achieve? I want to split a string that contains a non-English word into its individual characters.

  2. What is the issue? output:
    image

  3. What solutions have you tried so far? I tried to find a solution on the DevForum.

local t = {}
for i,v in pairs(string.split('Привет','')) do -- Russian word
	table.insert(t,v)
end
print(t)

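This happens because each Russian character is encoded as multiple bytes in UTF-8 (two bytes per Cyrillic letter), and string.split with an empty separator splits the string byte by byte, so every letter gets broken apart.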
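
To make the byte-level split concrete, here is a minimal sketch that compares the byte length of the string with its code point count. It uses only the `#` operator, `utf8.len`, and `string.byte`, and it assumes the script source is saved as UTF-8.

```lua
local text = 'Привет'

-- # counts bytes, not letters: each Cyrillic letter takes 2 bytes in UTF-8
local byteCount = #text
-- utf8.len counts Unicode code points instead
local codepointCount = utf8.len(text)

print(byteCount, codepointCount) -- 12 and 6

-- The first letter 'П' (U+041F) is stored as the two bytes 208 and 159
local firstByte, secondByte = string.byte(text, 1, 2)
print(firstByte, secondByte)
```

Splitting on bytes therefore yields 12 fragments, none of which is a valid character on its own.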

Looking at the sample you provided, I would recommend using utf8.graphemes(), which returns an iterator over the start and end byte positions of each grapheme (a visible character, in this case each Russian letter):

local t = {}
local text = 'Привет'
for first, last in utf8.graphemes(text) do
	table.insert(t, string.sub(text, first, last))
end
print(t)


Thank you for helping me a second time! I read the documentation on utf8, and here is the approach I came up with:

local t = {}
local str = 'Привет'
for position, codepoint in utf8.codes(str) do
	table.insert(t,utf8.char(codepoint))
end
print(t)
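
The loop above could be wrapped in a small reusable function. The sketch below is one way to do that; the name `splitCodepoints` is hypothetical and not part of any library. Note that it splits by code point, so a letter written as a base character plus a combining mark (for example `й` typed as `и` followed by U+0306) comes back as two entries, whereas utf8.graphemes would keep it together as one visible character.

```lua
-- Hypothetical helper: split a UTF-8 string into one entry per code point
local function splitCodepoints(str)
	local result = {}
	for _, codepoint in utf8.codes(str) do
		table.insert(result, utf8.char(codepoint))
	end
	return result
end

local letters = splitCodepoints('Привет')
print(#letters) -- 6
print(table.concat(letters, ','))

-- A base letter followed by a combining mark is two code points
local decomposed = splitCodepoints('\u{0438}\u{0306}')
print(#decomposed) -- 2
```

For plain Russian text both approaches give the same result; the difference only matters for text with combining marks or multi-code-point emoji.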