Luau CRC 32 - fast hashing

Introduction

Hey. :wave: I converted this simple CRC 32 hash/integrity check script to full Luau optimizations and added support for buffers. I also added an optional integer size for reading buffers.

I got the original code from Wikipedia’s/Wikimedia’s/MediaWiki’s internal code that is public: Module:Crc32lua - Meta Module:Crc32lua - MediaWiki

What is CRC 32 and how does it work?

CRC 32 is a very simple and fast hashing algorithm mostly used for integrity checking.
Unlike most hashing algorithms like SHA256, CRC 32 attempts to be as fast and simple as possible. This does mean that it’s not cryptographically secure.

CRC 32 works by combining two 32 bit unsigned integers and returns its hash as a single 32 bit unsigned integer (which is any number between 0 and 2^32-1 or 4294967295, and conveniently also the maximum number the 32 bit library can support).

CRC 32 hashes strings and buffers by hashing them a single byte at a time with input 1 being the byte being read and input two being the last hash CRC 32 computed.
At the start of the string the last hash is 0 by default unless a specific last hash value is fed to the function.

Usage

CRC.crc32(value, [, lastcrc, size])

To use the CRC 32 function you have to call it via CRC.32(value).
Accepted types of values for CRC32 include: strings, buffers and numbers that are unsigned 32 bit integers. The function will automatically work for all said types.

The CRC32 function has two optional parameters that are lastcrc and size.

  • lastcrc is the value fed to the first starting CRC input number 2 value, the default being 0. This is useful if you want to establish a continuous chain of integrity checking.
  • size is an optional bit size for reading buffers in. By default buffers are read in bytes (also known as 8 bit unsigned integers). Other available options are 16 bit and 32 bit unsigned integers, being 2 bytes and 4 bytes wide respectively.

Internal functions

The CRC 32 library also exposes the underlying functions for reading values. There is no reason for users to use these instead of CRC.crc32(value). The functionality of each is left for the viewer to discern.

  • CRC.crc32_byte(value, lastcrc)
  • CRC.crc32_string(value, lastcrc)
  • CRC.crc32_buffer(value, lastcrc, size)
  • CRC.crc_table
  • CRC.bit

Installation

To install this create a modulescript named CRC containing the below code and require it in your script via local CRC = require("./CRC"), local CRC = require("@self/CRC") or local CRC = require(script.CRC)

--!native
--!optimize 2
--[[

LUA MODULE

  digest.crc32 - CRC-32 checksum implemented entirely in Lua.

SYNOPSIS

  local CRC = require "digest.crc32lua"
  print(CRC.crc32 "test") --> 0xD87F7E0C or -662733300
  
  assert(CRC.crc32("st", CRC.crc32("te")) == CRC.crc32 "test")
  
DESCRIPTION

  This can be used to compute CRC-32 checksums on strings.
  This is similar to [1-2].

API

  Note: in the functions below, checksums are 32-bit integers stored in
  numbers.  The number format currently depends on the bit
  implementation--see DESIGN NOTES below.

  CRC.crc32_byte(byte [, crc]) --> rcrc
  
    Returns CRC-32 checksum `rcrc` of byte `byte` (number 0..255) appended to
    a string with CRC-32 checksum `crc`.  `crc` defaults to 0 (empty string)
    if omitted.

  CRC.crc32_string(s, crc) --> bcrc

    Returns CRC-32 checksum `rcrc` of string `s` appended to
    a string with CRC-32 checksum `crc`.  `crc` defaults to 0 (empty string)
    if omitted.
  
  CRC.crc32(o, crc) --> bcrc

    This invokes `crc32_byte` if `o` is a byte or `crc32_string` if `o`
    is a string.
  
  CRC.bit

    This contains the underlying bit library used by the module.  It
    should be considered a read-only copy.

DESIGN NOTES

  Currently, this module exposes the underlying bit array implementation in CRC
  checksums returned.  In BitOp, bit arrays are 32-bit signed integer numbers
  (may be negative).  In Lua 5.2 "bit32" and "bit.numberlua", bit arrays are
  32-bit unsigned integer numbers (non-negative).  This is subject to change
  in the future but is currently done due to (unconfirmed) performance
  implications.
  
  On platforms with 64-bit numbers, one way to normalize CRC
  checksums to be unsigned is to do `crcvalue % 2^32`,
  
  The name of this module is inspired by Perl `Digest::CRC*`.

DEPENDENCIES
 
  Requires one of the following bit libraries:

    BitOp "bit" -- bitop.luajit.org -- This is included in LuaJIT and also available
      for Lua 5.1/5.2.  This provides the fastest performance in LuaJIT.
    Lua 5.2 "bit32" -- www.lua.org/manual/5.2 -- This is provided in Lua 5.2
      and is preferred in 5.2 (unless "bit" also happens to be installed).
    "bit.numberlua" (>=000.003) -- https://github.com/davidm/lua-bit-numberlua
      This is slowest and used as a last resort.
      It is only a few times slower than "bit32" though.

DOWNLOAD/INSTALLATION

  If using LuaRocks:
    luarocks install lua-digest-crc32lua

  Otherwise, download <https://github.com/davidm/lua-digest-crc32lua/zipball/master>.
  Alternately, if using git:
    git clone git://github.com/davidm/lua-digest-crc32lua.git
    cd lua-digest-crc32lua
  Optionally unpack:
    ./util.mk
  or unpack and install in LuaRocks:
    ./util.mk install 
  
REFERENCES

  [1] http://www.axlradius.com/freestuff/CRC32.java
  [2] http://www.gamedev.net/reference/articles/article1941.asp
  [3] http://java.sun.com/j2se/1.5.0/docs/api/java/util/zip/CRC32.html
  [4] http://www.dsource.org/projects/tango/docs/current/tango.io.digest.Crc32.html
  [5] http://pydoc.org/1.5.2/zlib.html#-crc32
  [6] http://www.python.org/doc/2.5.2/lib/module-binascii.html
 
LICENSE

  (c) 2008-2011 David Manura.  Licensed under the same terms as Lua (MIT).
  (c) 2025 ccuser44/VortexColor. Licensed under the same terms as Lua (MIT).

  Permission is hereby granted, free of charge, to any person obtaining a copy
  of this software and associated documentation files (the "Software"), to deal
  in the Software without restriction, including without limitation the rights
  to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  copies of the Software, and to permit persons to whom the Software is
  furnished to do so, subject to the following conditions:

  The above copyright notice and this permission notice shall be included in
  all copies or substantial portions of the Software.

  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
  IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
  FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
  AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
  LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
  OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  THE SOFTWARE.
  (end license)
 
--]]

local M = {_TYPE="module", _NAME="digest.crc32", _VERSION="0.3.20111128"}

--local type = type
--local require = require
--local setmetatable = setmetatable

--[[
 Requires the first module listed that exists, else raises like `require`.
 If a non-string is encountered, it is returned.
 Second return value is module name loaded (or "").
 --]]
local function requireany(...)
	local errs = {}
	for _, name in ipairs({...}) do
		--if type(name) ~= "string" then return name, "" end
		local ok, mod = pcall(require, name)
		if ok then return mod, name end
		errs[#errs + 1] = mod
	end
	error(table.concat(errs, "\n"), 2)
end

--local bit, name_ = requireany("bit32", "bit", "bit.numberlua")
--local bxor = bit.bxor
--local bnot = bit.bnot
--local band = bit.band
--local rshift = bit.rshift

-- CRC-32-IEEE 802.3 (V.42)
local POLY = 0xEDB88320

-- Memoize function pattern (like http://lua-users.org/wiki/FuncTables ).
local function memoize(f)
	local mt = {}
	local t = setmetatable({}, mt)
	function mt:__index(k)
		local v = f(k); t[k] = v
		return v
	end
	return t
end

-- CRC table.
local crc_table = memoize(function(i)
	local crc = i
	for _ = 1, 8 do
		local b = bit32.band(crc, 1)
		crc = bit32.rshift(crc, 1)
		if b == 1 then crc = bit32.bxor(crc, POLY) end
	end
	return crc
end)
M.crc_table = crc_table

local function M_crc32_byte(byte, crc)
	crc = bit32.bnot(crc or 0)
	local v1 = bit32.rshift(crc, 8)
	local v2 = crc_table[bit32.bxor(crc % 256, byte)]
	return bit32.bnot(bit32.bxor(v1, v2))
end
M.crc32_byte = M_crc32_byte

local function M_crc32_string(s, crc)
	crc = crc or 0
	for i = 1, #s do
		crc = M_crc32_byte(string.byte(s, i), crc)
	end
	return crc
end
M.crc32_string = M_crc32_string

local function M_crc32_buffer(s, crc, size)
	crc = crc or 0
	size = size or 8
	if size == 8 then -- If stats with code dublication are best to avoid recurring branches
		for i = 0, buffer.len(s) - 1 do
			crc = M_crc32_byte(buffer.readu8(s, i), crc)
		end
	elseif size == 16 then
		for i = 0, math.floor(buffer.len(s) / 2) - 1 do
			crc = M_crc32_byte(buffer.readu16(s, i), crc)
		end
	elseif size == 32 then
		for i = 0, math.floor(buffer.len(s) / 4) - 1 do
			crc = M_crc32_byte(buffer.readu32(s, i), crc)
		end
	end
	return crc
end
M.crc32_buffer = M_crc32_buffer

function M.crc32(s, crc, size)
	if type(s) == "string" then
		return M_crc32_string(s, crc)
	elseif type(s) == "buffer" then
		return M_crc32_buffer(s, crc, size)
	else
		return M_crc32_byte(s, crc)
	end
end

M.bit = bit32 -- bit library used

return M
6 Likes