How can I split a string but still maintain the space?

I’m trying to create my own RichText system, and I am using string.split to split a message into separate words. However, I can’t get it to keep the spaces as well.

local Text = "Hello <Font=GothamBold>World"

string.split(Text, " ")

Returns

1 Text
2 <Font=GothamBold>
3 World

I’m then using

local MarkupKey, MarkupValue = string.match(word, "<(.+)=(.+)>")

To locate anything with <> (in my example, <Font=GothamBold>), as I want to have these words ignored.

But I still want a space between Hello and World; at the moment it’s not getting one.

Surely if you’re removing any markup text, you’ll get a list of words that you know have a space in between?

Save your split in a table. Perform table.remove on any indices that match your tag pattern. Then do table.concat with a space as the delimiter on what’s left.
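
A minimal sketch of that suggestion, assuming every tag is separated from the surrounding words by spaces (which is not the case in the original example, where `<Font=GothamBold>` is attached to `World`). It collects the words with gmatch rather than string.split so that it also runs outside Roblox; the variable names are only illustrative.

```lua
local text = "Hello <Font=GothamBold> World"

-- Collect the space-separated pieces into a table.
local words = {}
for word in text:gmatch("%S+") do
	table.insert(words, word)
end

-- Walk backwards so table.remove does not shift entries we have yet to visit.
for i = #words, 1, -1 do
	if words[i]:match("^<.+=.+>$") then
		table.remove(words, i)
	end
end

-- Join what is left with a single space.
local plain = table.concat(words, " ")
print(plain) --> Hello World
```

Iterating from the end matters here: removing index 2 while counting upwards would shift `World` into a slot the loop has already passed.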

The example you provided would not yield the output you suggested, as there is no space between <Font=GothamBold> and World. Splitting it on " " returns only two entries: Hello and <Font=GothamBold>World.

I would advise taking a different approach to this, first by stripping out all tags:

local originalText = "Hello <Font=GothamBold>World"
local formattedText = originalText:gsub("%b<>", "") --> "Hello World"

Once you’ve done this, you can split the string up as you wish:

local words = formattedText:split(" ")

However, split discards the spaces it splits on. You can use gmatch instead to iterate over all words including their spaces.
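
As a rough sketch of the gmatch idea: the pattern `%S+%s*` matches a run of non-space characters together with whatever whitespace follows it, so each word keeps its trailing space. The sample string and variable names are made up for the example.

```lua
local formattedText = "Hello World again"

-- "%S+" is the word itself, "%s*" is any whitespace that follows it.
local pieces = {}
for piece in formattedText:gmatch("%S+%s*") do
	table.insert(pieces, piece)
end

-- Brackets make the kept spaces visible: [Hello ], [World ], [again]
for i, piece in ipairs(pieces) do
	print(i, "[" .. piece .. "]")
end
```

Note that this pattern drops any whitespace at the very start of the string, since every match has to begin with a non-space character.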

I still want to be able to get any markups (<>) though, as well as having it in a table format.

-- Split text up
local SplitText = string.split(text, " ")

for i, word in pairs(SplitText) do
	local MarkupKey, MarkupValue = string.match(word, "<(.+)=(.+)>")

	if MarkupKey and MarkupValue then
		-- Markup found
		if not ApplyMarkup(MarkupKey, MarkupValue) then
			-- Invalid markup
			warn("Could not apply markup: ", word)
		end
	else
		-- No markup, print text as normal
		PrintText(word)
	end
end

I see, I think you’ll need to tell us a little more about your markup so we can provide better support.

  • Does it have closing tags? (e.g. </Font>)
  • What if there are two markup tags one after another?
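
The second question matters for the pattern used above. A small sketch, using a made-up input of two adjacent tags, shows what the greedy `.+` captures do and one possible tighter pattern (the character classes are a suggestion, not something the original code uses):

```lua
local word = "<Font=GothamBold><Yield=2>"

-- Greedy ".+" stretches across both tags, so the key swallows the first tag.
local greedyKey, greedyValue = string.match(word, "<(.+)=(.+)>")
print(greedyKey, greedyValue) -- key is "Font=GothamBold><Yield", value is "2"

-- Excluding "<", ">" and "=" from the key, and "<", ">" from the value,
-- keeps each match inside a single tag.
local tags = {}
for key, value in string.gmatch(word, "<([^<>=]+)=([^<>]+)>") do
	table.insert(tags, { key = key, value = value })
end

for _, tag in ipairs(tags) do
	print(tag.key, tag.value) -- Font GothamBold, then Yield 2
end
```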

The ApplyMarkup function handles the markups. Basically, MarkupKey and MarkupValue should be
Font GothamBold
and then I can check: if key == Font, do font stuff with the value. If value == /, revert the key back to the default settings I have stored.

A longer text example

local Text = "Hello <Font=GothamBold><Yield=2> World!<Font=/>"

I want the result to be
Hello World!
*Yield basically delays how long before the text after it is shown. So: Hello (wait 2 seconds) World!

But because I’m creating each letter as a separate TextLabel, I’m also creating a TextLabel for the word (its text stays invisible; it’s more of a holder to show what the text would be).


Group001 has text as Hello, and each character underneath it is the individual letters (H, e, l, l, o)

Edit: Don’t use this version of the parser, it’s not as good as the one I posted below.


Great! This helps a lot. In which case it sounds like you’re trying to create a list of words separated by tags. Personally I would handle this using a custom iterator:

local function parse(markup)
	local snippets = markup:gsub("%b<>", "<>"):split("<>")
	local iterator = ("<>" .. markup):gmatch("%b<>")
	return function (lastTag, lastSnippet)
		return iterator(lastTag), table.remove(snippets, 1)
	end
end

for tag, text in parse(text) do
	if tag then
		local key, value = tag:match("<(.+)=(.+)>")
		-- handle markup
	end
	-- handle text after markup
end

The iterator parse puts your text into tag, text pairs for you to handle as required. For example, the output of the string "Hello <Font=GothamBold><Yield=2> World!<Font=/>" would be:

"<>"					"Hello "
"<Font=GothamBold>"		""
"<Yield=2>"				" World!"
"<Font=/>"				""

You’ll notice there is a <> in the first row. This represents an empty tag and should be ignored when handling your markup.

Hmmm ok, I’ll give that a try, thank you! :smiley:

Can I ask, what’s the difference between using like

local Text = "Test"
string.match(Text, "<(.+)=(.+)>")
Text:match("<(.+)=(.+)>")

Is there a reason for having either or, or just personal preference?

It’s personal preference. I believe using the static methods (string.match) is more efficient, but for the most part this should be negligible.
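
For illustration, a small sketch showing that the two call styles return the same thing. `Text:match(p)` is syntactic sugar: the lookup goes through the string metatable and ends up calling the same function as `string.match(Text, p)`. The sample string is made up for the example.

```lua
local Text = "<Font=GothamBold>"
local pattern = "<(.+)=(.+)>"

-- Library-function style: the string is passed as the first argument.
local k1, v1 = string.match(Text, pattern)

-- Method style: the string before the colon becomes the first argument.
local k2, v2 = Text:match(pattern)

print(k1, v1) -- Font GothamBold
print(k2, v2) -- Font GothamBold

-- A string literal needs parentheses before a method call.
local k3 = ("<Yield=2>"):match(pattern)
```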

I’m only getting 1 word returned :confused:

for tag, text in Parse(text) do
	if tag then
		print(tag, text)
		local key, value = string.match(tag, "<(.+)=(.+)>")
		if key and value then
			if not ApplyMarkup(key, value) then
				-- Invalid markup
				warn("Could not apply markup: ", text)
			end
		else
			PrintText(text)
		end
	end
	-- handle text after markup
end

-- What I'm passing
"Hello <Font=GothamBold><Yield=1>World!<Emote=Happy>"

Okay, I thought about this some more and I came up with a nicer iterator for this. Rather than trying to separate out text as its own thing, it converts text into its own tag. So the string:

"Hello <Font=GothamBold><Yield=1>World!<Emote=Happy>"

becomes:

"<Text=Hello ><Font=GothamBold><Text=><Yield=1><Text=World!><Emote=Happy><Text=>"

Then it’s just as simple as reading each tag one-by-one.

for tag, key, value in parseV2(text) do
	if key == "Text" then
		-- do text stuff
	else
		-- do other stuff
	end
end

The iterator is as follows:

local function parseV2(markup)
	local formattedMarkup = ("<Text=" .. markup:gsub("%b<>", ">%1<Text=") .. ">")
	local iterator = formattedMarkup:gmatch("%b<>")
	return function (lastTag)
		local nextTag = iterator(lastTag)
		if nextTag == nil then
			return nil
		end
		return nextTag, nextTag:match("<(.*)=(.*)>")
	end
end

So any normal text I’d have to write with <Text=Normal text>??

Nope, it will automatically add these tags for you.


The following code:

local text = "Hello <Font=GothamBold><Yield=2> World!<Font=/>Boop"

for _, key, value in parseV2(text) do
	print(key, value)
end

Gives the output:

Text	Hello 
Font	GothamBold
Text	
Yield	2
Text	 World!
Font	/
Text	Boop

It works! :grimacing: :smiley:

However, it doesn’t work if I do this

"Hello <Font=Gotham><Yield=1>World!<Emote=Happy> I am just writing more so it's more lines lelelelel ahahaa"

As when I do

if key == "Text" then
    print(value)
end

I get

In between tests I was thinking, though: the main reason I’m writing my own, rather than using Defaultio’s one (I’ve taken bits and pieces of his and rewritten them in my own kind of way), is that I want to control certain things at certain points in the string.

If I have this for example

local Text = "You are. <Yield=0.5>.<Yield=0.5>.<Yield=1>CORRECT!<Emote=Happy><Sound=Correct>"

The problem with his was that it collected all of these tags up front, so the Happy emote was occurring before the actual text had finished. So I was thinking: would it be better to have 2 tag systems?
<> - for anything relating to the text (such as Font, Color, Size, etc.)
** - for anything related to a timed thing, like animations, emotes, sounds, etc.

And so I’d have it go through the string and remove all **. Then find any <> and apply those markups to the text. Then, when I’m displaying the text later on, look for ** and apply the markups for those when that set of text appears on screen?

This is because it adds empty text tags to the original markup between existing tags. To fix this, you can change the iterator to:

local function parseV3(markup)
	local formattedMarkup = ("<Text=" .. markup:gsub("%b<>", ">%1<Text=") .. ">")
	local iterator = formattedMarkup:gmatch("%b<>")
	return function (tag, key, value)
		repeat
			tag = iterator(tag)
			if not tag then
				return nil, nil, nil
			end
			key, value = tag:match("<(.+)=(.+)>")
		until key and value
		return tag, key, value
	end
end

This will skip over invalid tags (those with an empty key or an empty value, such as the automatically added <Text=>).
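
A self-contained sketch of how that skipping behaves on a longer string. The parser here is a lightly restructured copy of the one above (a `while` loop instead of `repeat`, and `%0` for the whole match), and the `steps` and `textPieces` tables are hypothetical names used only to inspect the result:

```lua
local function parseV3(markup)
	-- Wrap every stretch of plain text in a <Text=...> tag.
	local wrapped = markup:gsub("%b<>", ">%0<Text=")
	local nextTag = ("<Text=" .. wrapped .. ">"):gmatch("%b<>")
	return function()
		while true do
			local tag = nextTag()
			if not tag then
				return nil
			end
			local key, value = tag:match("<(.+)=(.+)>")
			if key then
				return tag, key, value
			end
			-- No match means an empty key or value (e.g. "<Text=>"): skip it.
		end
	end
end

local text = "Hello <Font=Gotham><Yield=1>World!<Emote=Happy> I am writing more"

local steps, textPieces = {}, {}
for _, key, value in parseV3(text) do
	table.insert(steps, key)
	if key == "Text" then
		table.insert(textPieces, value)
	end
end

print(table.concat(steps, ","))  --> Text,Font,Yield,Text,Emote,Text
print(table.concat(textPieces)) --> Hello World! I am writing more
```

The tags come back in the order they appear in the string, so a timed tag such as Emote is only reached after the text in front of it.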

I get this error:
attempt to index nil with 'match'

on this line

key, value = tag:match("<(.+)=(.+)>")

Assuming I keep this for loop the same??

for tag, key, value in parseV3(text) do

end

??

My bad, I’ve updated the code above.

Still only shows this :confused:

Still struggling with this, if anyone can offer any further input :confused: