Heads up, I might have actually squashed a few bugs during the time you were modifying it. You might want to change your parts of code that I have edited.
chatgpt will turn the code into js
defining all functions as variables will have no improvement in fact I believe it could slow it down more, I tried to test it myself though and I couldn’t see any benefits when using locally declared functions.
this used to be a problem of the past though I remember it being discussed
try it for yourself replace the insert with table.insert if you want to try
local insert = table.insert
local Table1 = {}
local start1 = tick()
for i = 1, 10000000 do
insert(Table1,i)
end
print("code took "..tick() - start1.." ms!")
task.wait()
local start2 = tick()
local Table2 = {}
for i = 1, 10000000 do
table.insert(Table2,i)
end
print("code took "..tick() - start2.." ms!")
Most of the time the second print will show that it took slightly more but it doesnt have to do with the speed of the function but the first table slowing the code down switching the functions will reveal that again the second print will be slower
I consider comparing loops with a compression algorithm invalid, try it with the actual module (and its versions) with multiple iterations.
unfortunately, the revised module still seems to choke on bytes \0
and \255
. this seems to be caused by off-by-one errors on lines 27 and 39 (1, 31
and 1, 255 - 181
; should seemingly be 0, 31
and 1, 255 - 180
).
however, my attempted fix causes a different error to appear, on line 76 due to tobase92
being called with a nil argument.
i don’t understand the code enough to properly fix this. can somebody more knowledgeable please take a look?
I actually fixed it a while back for myself, but I never bothered to post a new reply, so here you go.
-- Module by 1waffle1 and boatbomber, optimized and fixed by iiau
-- https://devforum.roblox.com/t/text-compression/163637/37
local dictionary = {} -- key to len
do -- populate dictionary
local length = 0
for i = 32, 127 do
if i ~= 34 and i ~= 92 then
local c = string.char(i)
dictionary[c] = length
dictionary[length] = c
length = length + 1
end
end
end
local escapemap_126, escapemap_127 = {}, {}
local unescapemap_126, unescapemap_127 = {}, {}
local blacklisted_126 = { 34, 92 }
for i = 126, 180 do
table.insert(blacklisted_126, i)
end
do -- Populate escape map
-- represents the numbers 0-31, 34, 92, 126, 127 (36 characters)
-- and 128-180 (52 characters)
-- https://devforum.roblox.com/t/text-compression/163637/5
for i = 0, 31 + #blacklisted_126 do
local b = blacklisted_126[i - 31]
local s = i + 32
-- Note: 126 and 127 are magic numbers
local c = string.char(b or i)
local e = string.char(s + (s >= 34 and 1 or 0) + (s >= 91 and 1 or 0))
escapemap_126[c] = e
unescapemap_126[e] = c
end
for i = 1, 255 - 180 do
local c = string.char(i + 180)
local s = i + 34
local e = string.char(s + (s >= 92 and 1 or 0))
escapemap_127[c] = e
unescapemap_127[e] = c
end
end
local function escape(s)
-- escape the control characters 0-31, double quote 34, backslash 92, Tilde 126, and DEL 127 (36 chars)
-- escape characters 128-180 (53 chars)
return string.gsub(string.gsub(s, '[%c"\\\126-\180]', function(c)
return "\126" .. escapemap_126[c]
end), '[\181-\255]', function(c)
return "\127" .. escapemap_127[c]
end)
end
local function unescape(s)
return string.gsub(string.gsub(s, "\127(.)", function(e)
return unescapemap_127[e]
end), "\126(.)", function(e)
return unescapemap_126[e]
end)
end
local b93Cache = {}
local function tobase93(n)
local value = b93Cache[n]
if value then
return value
end
local c = n
value = ""
repeat
local remainder = n % 93
value = dictionary[remainder] .. value
n = (n - remainder) / 93
until n == 0
b93Cache[c] = value
return value
end
local b10Cache = {}
local function tobase10(value)
local n = b10Cache[value]
if n then
return n
end
n = 0
for i = 1, #value do
n = n + math.pow(93, i - 1) * dictionary[string.sub(value, -i, -i)]
end
b10Cache[value] = n
return n
end
local function compress(text)
assert(type(text) == "string", "bad argument #1 to 'compress' (string expected, got " .. typeof(text) .. ")")
local dictionaryCopy = table.clone(dictionary)
local key, sequence, size = "", {}, #dictionary
local width, spans, span = 1, {}, 0
local function listkey(k)
local value = tobase93(dictionaryCopy[k])
local valueLength = #value
if valueLength > width then
width, span, spans[width] = valueLength, 0, span
end
table.insert(sequence, string.rep(" ", width - valueLength) .. value)
span += 1
end
text = escape(text)
for i = 1, #text do
local c = string.sub(text, i, i)
local new = key .. c
if dictionaryCopy[new] then
key = new
else
listkey(key)
key = c
size += 1
dictionaryCopy[new] = size
dictionaryCopy[size] = new
end
end
listkey(key)
spans[width] = span
return table.concat(spans, ",") .. "|" .. table.concat(sequence)
end
local function decompress(text)
assert(type(text) == "string", "bad argument #1 to 'decompress' (string expected, got " .. typeof(text) .. ")")
local dictionaryCopy = table.clone(dictionary)
local sequence, spans, content = {}, string.match(text, "(.-)|(.*)")
local groups, start = {}, 1
for span in string.gmatch(spans, "%d+") do
local width = #groups + 1
groups[width] = string.sub(content, start, start + span * width - 1)
start = start + span * width
end
local previous
for width, group in ipairs(groups) do
for value in string.gmatch(group, string.rep(".", width)) do
local entry = dictionaryCopy[tobase10(value)]
if previous then
if entry then
table.insert(dictionaryCopy, previous .. string.sub(entry, 1, 1))
else
entry = previous .. string.sub(previous, 1, 1)
table.insert(dictionaryCopy, entry)
end
table.insert(sequence, entry)
else
sequence[1] = entry
end
previous = entry
end
end
return unescape(table.concat(sequence))
end
return { compress = compress, decompress = decompress }
It turns out I didn’t need to change it to base-92. Instead I found out that the escapemap
had overlap with character codes required to unescape it. So I have to make an entire dictionary just for unescaping the escape characters.
I’m currently using this live production in one of my games, Felandia, serving hundreds of daily players and I have not experienced any issues whatsoever.
Enjoy!
I’m exporting training data for a neural network and this compressed my json file from 31 script objects (199999 characters per script) to just 13. Good stuff!
I’ve been rewriting this code into Typescript yesterday for my Datastore optimizations and sharding mechanics.
However as I read conversation, it looks like there are many improved versions, would someone point out which of these versions are production stable?
Thanks alot!
I went through most (some had problems), but the one modified by @iiau has been working great.
[EDIT] By using maybeyield from this reply by Anaminus, post Reply from Anaminus I was able to yield every .5 sec or so to avoid exhaust from occouring, just leaving this if anyone else run into this issue
Great module! One issue I run into is when compressing big texts is the “Script timeout: exhausted allowed execution time” Would it be possible to implement some way where you can customize how much of the processing power the system uses, lets say 70% of cpu or something along that?