Hey! Thanks for this code project!
I changed the decoder over to use buffers for a project of ours, thought I should share it here.
We saw a speedup of roughly 7,700x, going from 115 seconds to 15 ms.
function lz4.decompress(input: buffer, offset: number?): buffer
    -- offset is optional and defaults to the start of the buffer
    local start = offset or 0
    local compressedLen = buffer.readi32(input, start + 0)
    local decompressedLen = buffer.readi32(input, start + 4)
    -- bytes 8 - 11 are not used in this implementation

    -- a compressed length of 0 means the payload is stored uncompressed
    if compressedLen == 0 then
        local output = buffer.create(decompressedLen)
        buffer.copy(output, 0, input, start + 12, decompressedLen)
        return output
    end

    return lz4.decompressHeaderless(input, start + 12, decompressedLen)
end
function lz4.decompressHeaderless(input: buffer, offset: number, decompressedLen: number): buffer
    local output = buffer.create(decompressedLen)
    local iHead = offset
    local oHead = 0

    while true do
        -- high nibble of the token is the literal length, low nibble the match length
        local token = buffer.readu8(input, iHead)
        local litLen = token // 0b10000
        local matLen = token % 0b10000
        iHead += 1

        if litLen == 0b1111 then
            repeat -- this tally system is actually not a smart way to encode length in general.
                local extraLen = buffer.readu8(input, iHead)
                litLen += extraLen
                iHead += 1
            until extraLen ~= 0b11111111
        end

        buffer.copy(output, oHead, input, iHead, litLen)
        iHead += litLen
        oHead += litLen

        -- this is smart, it helps recoup some of the losses at the beginning from storing the length
        if oHead == decompressedLen then break end

        local readback = buffer.readu16(input, iHead)
        local copyStart = oHead - readback
        iHead += 2

        if matLen == 0b1111 then
            repeat
                local extraLen = buffer.readu8(input, iHead)
                matLen += extraLen
                iHead += 1
            until extraLen ~= 0b11111111
        end
        matLen += 4

        -- unfortunately, a single buffer.copy(output, oHead, output, copyStart, matLen) does not work
        -- when the match overlaps the bytes being written, so copy in non-overlapping pieces instead
        -- also why not do this for the literal read too?
        while matLen > oHead - copyStart do
            local maxLen = oHead - copyStart
            buffer.copy(output, oHead, output, copyStart, maxLen)
            matLen -= maxLen
            oHead += maxLen
        end
        buffer.copy(output, oHead, output, copyStart, matLen)
        oHead += matLen

        if oHead == decompressedLen then break end
    end

    return output
end
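To show why the match copy is split up, here is a small sketch of just that step pulled out into a helper. `copyMatch` is a name I made up for illustration and is not part of the library. As far as I understand, `buffer.copy` treats an overlapping copy within one buffer as if the source were copied out first, so a single call would not repeat the pattern the way an LZ4 match expects.

```luau
-- Hypothetical helper (not in the library): the overlapping-match copy on its own.
local function copyMatch(output: buffer, oHead: number, copyStart: number, matLen: number): number
    local distance = oHead - copyStart
    -- While the match is longer than what has been written since copyStart,
    -- copy that whole run. Source and target never overlap, and because the
    -- data is periodic the usable source doubles on every pass.
    while matLen > distance do
        buffer.copy(output, oHead, output, copyStart, distance)
        matLen -= distance
        oHead += distance
        distance = oHead - copyStart
    end
    buffer.copy(output, oHead, output, copyStart, matLen)
    return oHead + matLen
end

-- Two bytes already written, then a match of length 6 at offset 2.
local out = buffer.create(8)
buffer.writestring(out, 0, "ab")
local newHead = copyMatch(out, 2, 0, 6)
print(buffer.tostring(out), newHead) -- abababab 8
```

The match here is three times longer than its offset, which is the case a single copy cannot handle.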
I see that you were working on an implementation using the buffer object. Could you share a link to it?
Also if you wanted to take a compression ratio hit for faster performance, you could totally chunk it up into several smaller pieces. With very large pieces of data, the compression ratio hit would probably be very small.
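For what it's worth, here is a rough sketch of what the decode side of chunking could look like. Everything about the layout is my assumption: chunks sit back to back, each with its own 12-byte header as above, and a `compressedLen` of 0 means the chunk is stored raw. `decompressChunks` and `rawOnly` are made-up names; the decoder is passed in as a parameter so the snippet runs on its own, and in practice it would be `lz4.decompress`.

```luau
-- Hypothetical: decode back-to-back chunks, each with its own 12-byte header.
local function decompressChunks(input: buffer, decompress: (buffer, number) -> buffer): buffer
    local parts = {}
    local total = 0
    local offset = 0
    while offset < buffer.len(input) do
        local compressedLen = buffer.readi32(input, offset)
        local decompressedLen = buffer.readi32(input, offset + 4)
        table.insert(parts, decompress(input, offset))
        total += decompressedLen
        -- assumption: compressedLen == 0 marks a chunk stored raw
        local payloadLen = if compressedLen == 0 then decompressedLen else compressedLen
        offset += 12 + payloadLen
    end

    -- stitch the decoded chunks together
    local output = buffer.create(total)
    local oHead = 0
    for _, part in parts do
        buffer.copy(output, oHead, part, 0, buffer.len(part))
        oHead += buffer.len(part)
    end
    return output
end

-- Stand-in decoder for the demo; it only handles raw (uncompressed) chunks.
local function rawOnly(input: buffer, offset: number): buffer
    local len = buffer.readi32(input, offset + 4)
    local out = buffer.create(len)
    buffer.copy(out, 0, input, offset + 12, len)
    return out
end

-- Two raw chunks: "foo" and "ba" (compressedLen fields are left at 0).
local input = buffer.create(12 + 3 + 12 + 2)
buffer.writei32(input, 4, 3)
buffer.writestring(input, 12, "foo")
buffer.writei32(input, 19, 2)
buffer.writestring(input, 27, "ba")

local result = decompressChunks(input, rawOnly)
print(buffer.tostring(result)) -- fooba
```

Since each chunk is independent, matches cannot reach back into earlier chunks, which is where the compression ratio hit would come from.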