(String Patterns) Splitting a string at commas while ignoring nested commas

There’s no easy way to explain this but I shall try my best.

Let’s say I have this string (this is the general format, not the actual string): foo, bar(car, dad), ear. I want to split the string into: foo, bar(car, dad), and ear. That is, I want to split it at commas, but without the comma inside the parentheses breaking bar(car, dad) into bar(car and dad). I’m not sure how to do this without resorting to iterating through the entire string.

I’ve tried the pattern [^,]*%b() but it only matches items that contain parentheses, so it falls apart on plain items like foo or ear, and as far as I’m aware there’s no way to make the balanced string pattern (%b) optional. If it’s not possible to do this, I’ll make do, but I would really like to avoid looping through all of the characters of the string, so help would be appreciated.
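
To make the failure concrete, here is a minimal sketch of what that pattern returns on the example string. The variable names are only for illustration.

```lua
-- Collect every match of the attempted pattern on the example string.
local input = "foo, bar(car, dad), ear"
local found = {}
for piece in input:gmatch("[^,]*%b()") do
	table.insert(found, piece)
end

-- Only the item with parentheses is matched (with its leading space).
print(#found, found[1])
```

Because %b() is mandatory in the pattern, foo and ear never match and are silently dropped.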


This problem reminds me of parsing CSVs that have commas inside quoted entries (e.g. Test1,Test2,“Test3,Test4”,Test5). I am not sure this is possible with Lua string patterns alone, since “%b” can’t take a modifier such as “?”.
Here is a solution I just wrote that should work if you don’t find one that uses string patterns.

local function SplitString(String)
	local SplitStrings = {}
	local CurrentParentheses = 0
	local CurrentString = ""
	
	--Parse the string.
	for i = 1,#String do
		local Character = string.sub(String,i,i)
		local IgnoreCharacter = false
		
		--Handle special characters.
		if Character == "(" then
			--Add to the nesting counter.
			CurrentParentheses = CurrentParentheses + 1
		elseif Character == ")" then
			--Subtract from the nesting counter.
			CurrentParentheses = math.max(0,CurrentParentheses - 1)
		elseif Character == "," and CurrentParentheses == 0 then
			--Add and reset the string.
			table.insert(SplitStrings,CurrentString)
			CurrentString = ""
			IgnoreCharacter = true
		end
		
		--Add the character to the string.
		if not IgnoreCharacter and (CurrentString ~= "" or Character ~= " ") then
			CurrentString = CurrentString..Character
		end
	end
	
	--Add the last string in case there wasn't a trailing comma.
	if CurrentString ~= "" then
		table.insert(SplitStrings,CurrentString)
	end
	
	--Return the split strings.
	return SplitStrings
end

--Print the split strings.
local HttpService = game:GetService("HttpService")
print(HttpService:JSONEncode(SplitString("foo, bar(car, dad), ear")))
print(HttpService:JSONEncode(SplitString("foo, bar(car, dad), ear, ")))
print(HttpService:JSONEncode(SplitString("foo, bar(car, foobar(dad, bar)), ear, ")))

Whenever you can’t do something in a single string pattern, you can always use a sequence of them. Remember that Lua patterns are not full regular expressions and are quite limited.

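local function SplitString(str)
	-- Swap every balanced (...) group for a "%s" placeholder so its commas are hidden.
	-- Assumes the input never contains a literal "%s".
	local matches = {}
	str = str:gsub("%b()", function(m)
		table.insert(matches, m)
		return "%s"
	end)

	-- Split on the remaining top-level commas and restore the groups in order.
	local separated = {}
	local i = 0
	for m in str:gmatch("[^,]+") do
		-- Trim the whitespace around each item instead of editing the whole string,
		-- so spaces inside the parentheses are preserved.
		m = m:match("^%s*(.-)%s*$")
		m = m:gsub("%%s", function()
			i = i + 1
			return matches[i]
		end)
		if m ~= "" then
			table.insert(separated, m)
		end
	end

	return separated
end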
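
For completeness, here is a sketch of a third option that sits between the two answers: it still walks the string, but string.find jumps from one comma or opening parenthesis to the next, and %b() skips each parenthesised group in a single step, so it does not visit every character in Lua code. SplitTopLevel is a hypothetical name, not something from the posts above, and the behaviour for empty items (they are dropped) is an assumption carried over from the earlier answers.

```lua
-- Hypothetical helper: split on top-level commas, letting %b() skip
-- each parenthesised group in one step.
local function SplitTopLevel(str)
	local parts = {}
	local fieldStart = 1 -- where the current item begins
	local pos = 1        -- where the next search begins

	while true do
		-- Find the next comma or opening parenthesis.
		local at, _, char = str:find("([,(])", pos)
		if not at then
			break
		end

		if char == "(" then
			-- Jump past the balanced group; an unmatched "(" is treated as plain text.
			local _, groupEnd = str:find("^%b()", at)
			pos = (groupEnd or at) + 1
		else
			-- Top-level comma: close the current item.
			table.insert(parts, str:sub(fieldStart, at - 1))
			fieldStart = at + 1
			pos = at + 1
		end
	end
	table.insert(parts, str:sub(fieldStart))

	-- Trim surrounding whitespace and drop empty items (e.g. after a trailing comma).
	local result = {}
	for _, part in ipairs(parts) do
		local trimmed = part:match("^%s*(.-)%s*$")
		if trimmed ~= "" then
			table.insert(result, trimmed)
		end
	end
	return result
end

local items = SplitTopLevel("foo, bar(car, foobar(dad, bar)), ear, ")
print(table.concat(items, " | ")) -- foo | bar(car, foobar(dad, bar)) | ear
```

Unlike the placeholder approach, this never rewrites the input, so a literal %s in the string is harmless.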