:gsub with either of two patterns

I want to split a string into an array of words and punctuation in a single pass, preserving their order, using :gsub. I don’t think there’s a way to pass a logical expression to :gsub so that it picks either of two patterns, so I don’t know how to approach this.

Below is the theoretical form of what I’m trying to achieve:

local str = "Lorem ipsum dolor sit amet, consectetur adipiscing elit..."
local t = {}
str:gsub(
    "%a+" or "%p+",     -- Somehow split into words or punctuation
    function(c) table.insert(t,c) end
)
for _,v in ipairs (t) do print(v) end

Expected:

Lorem 
ipsum 
dolor 
sit 
amet
, 
consectetur 
adipiscing 
elit
...

Any help is appreciated, thank you in advance!
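
For anyone reading along, here is a small sketch of why the theoretical form above cannot work as written: Lua evaluates the `or` before :gsub ever sees it, and since every string is truthy, the expression collapses to the first pattern. The variable names `pattern` and `found` are just for illustration.

```lua
-- `or` returns its first truthy operand, and every string is truthy in Lua,
-- so this expression is simply "%a+" by the time gsub receives it.
local pattern = "%a+" or "%p+"
print(pattern) --> %a+

local str = "Lorem ipsum dolor sit amet, consectetur adipiscing elit..."
local found = {}
str:gsub(pattern, function(c) table.insert(found, c) end)

-- Only the words are collected; the "," and "..." are never matched.
print(#found) --> 8
print(table.concat(found, " "))
```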


Although I found out it’s impossible to do alternations with Lua patterns, I solved it with a little manual trimming. This works as expected:

local str = "Lorem ipsum dolor sit amet, consectetur-adipiscing elit..."
local t = {}

while true do
    local first = str:match("^%w+") or str:match("^%p+")
    if not first then break end
    table.insert(t,first)
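    -- Remove the token by length: passing `first` to gsub would treat it
    -- as a pattern, which breaks on punctuation such as "%" or "(".
    str = str:sub(#first + 1):gsub("^%s+", "")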
end

for _,v in ipairs (t) do print(v) end
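
A possible single-pass alternative, offered as a sketch rather than a definitive solution: since alternation is not available, one pattern can capture an optional run of word characters followed by an optional run of punctuation, and empty captures are skipped. The function name `tokenize` is made up for this example, and it assumes the text is plain ASCII (characters that are neither %w nor %p, such as whitespace, are simply dropped).

```lua
-- Hypothetical helper: splits a string into words and punctuation runs,
-- in order of appearance, without rewriting the string on every step.
local function tokenize(str)
    local tokens = {}
    -- Each match yields a (possibly empty) word followed by a
    -- (possibly empty) punctuation run; empty captures are ignored.
    for word, punct in str:gmatch("(%w*)(%p*)") do
        if word ~= "" then table.insert(tokens, word) end
        if punct ~= "" then table.insert(tokens, punct) end
    end
    return tokens
end

local tokens = tokenize("Lorem ipsum dolor sit amet, consectetur-adipiscing elit...")
for _, v in ipairs(tokens) do print(v) end
```

Because the string is scanned once instead of being trimmed and copied on every iteration, this should also be kinder to large inputs, though I have not measured it.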

So, what do you expect in the output?
This?

Lorem 
ipsum 
dolor 
sit 
amet
, 
consectetur 
adipiscing 
elit
...

I mean, I don’t think you can do this without putting spaces before the comma after amet and before the dots after elit.
You can do this, though:

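local str = "Lorem ipsum dolor sit amet , consectetur adipiscing elit ..."

-- string.split comes from Roblox's Luau string library, not standard Lua.
-- ipairs guarantees the pieces are visited in order.
for i,v in ipairs(str:split(" ")) do
  print(v)
end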

You can just combine them into a single class:

local first = str:match("^[%w%p]+")
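
A quick sketch of what a combined set does in practice, in case it helps: `[%w%p]` is the union of both classes, so a run that mixes letters and punctuation is matched as one piece instead of being split at the boundary. The names `sample` and `merged` are only for this example.

```lua
local sample = "amet, consectetur-adipiscing elit..."
local merged = {}

-- [%w%p]+ accepts letters, digits and punctuation interchangeably,
-- so only whitespace separates the matches.
for token in sample:gmatch("[%w%p]+") do
    table.insert(merged, token)
end

for _, v in ipairs(merged) do print(v) end
--> amet,
--> consectetur-adipiscing
--> elit...
```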

I’m running it through a large plaintext file so I don’t think it’s practical to do this. Appreciate the help.

This matched only consectetur-adipiscing, but thank you for the help!