I am scripting a procedural voxel terrain generation module, and it’s been going well so far. Currently it only accepts numerical seeds, but I want a system where you can also input string seeds that get hashed into a numerical seed, like in Minecraft or Terraria. According to my research, Minecraft does this through Java’s String.hashCode() method, whose native implementation is:
public int hashCode() {
    int h = hash;
    if (h == 0) {
        int off = offset;
        char val[] = value;
        int len = count;
        for (int i = 0; i < len; i++) {
            h = 31*h + val[off++];
        }
        hash = h;
    }
    return h;
}
According to a two-year-old DevForum post I found, this is equivalent to s[0]*31^(n - 1) + s[1]*31^(n - 2) + ... + s[n - 1], or in Lua:
local max32 = math.pow(2, 32) - 1
local function HashCode(str)
    local val = 0
    local n = str:len()
    for i = 1, n do
        val = (val + str:byte(i) * math.pow(31, n - i)) % max32
    end
    return val
end
I tried this and, surprisingly, it works - to a certain extent. I looked up the seed’s true numerical hash on a third-party website, then generated two separate worlds in Minecraft, one with the string seed and one with the numerical seed the website produced, and both worlds were the same. However, my code only matches for seeds of up to 5 characters. I don’t know why, but with the seed roblox it outputs an inaccurate numerical seed. This isn’t the case with roblo. Can anyone please help me understand why this isn’t fully accurate?
I should mention that hashCode only produces 32-bit signed integers; however, that’s the least of my concerns, because I’m certain that a 6-character string seed wouldn’t produce anything larger than that.
Nice to have you reply. And to put this into perspective, I used 2 sources to check against the hash output from your code. One was the third party website and the other was simply generating 2 worlds in Minecraft using the string seed and the “supposed” numerical seed.
Here’s what happens with a 5 character seed:
[Screenshots: output of your code and of the 3rd party website]
Here’s what happens with anything longer than 5 characters:
[Screenshots: output of your code and of the 3rd party website]
They’re certainly not hitting the 32-bit signed integer limit, though I’m interested as to why it only works for up to 5 characters.
“They’re certainly not hitting the 32-bit signed integer limit”
Actually, they are: a signed 32-bit integer only covers the range -2,147,483,648 to 2,147,483,647 (roughly +/-2 billion).
Java’s hashCode returns a signed int, while mine returns an “unsigned int”*.
So if you treat my function’s output as a signed int instead, by subtracting 2^32 whenever it’s bigger than an int’s maximum value (2^31-1), you get the same answer:
108685832 → not bigger than 2^31-1 → do nothing
3369260912 → bigger than 2^31-1 → subtract 2^32 → -925706384
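To make the wrap-around step concrete, here is a minimal sketch of the conversion on its own. The helper name toSignedInt32 is just an illustrative name I picked, and it assumes the input is already a whole number between 0 and 2^32-1:

```lua
-- Hypothetical helper: reinterpret an unsigned 32-bit value as a signed one.
-- Assumes u is a whole number in the range [0, 2^32 - 1].
local TWO_POW_32 = 4294967296 -- 2^32
local MAX_INT32 = 2147483647  -- 2^31 - 1

local function toSignedInt32(u)
    if u > MAX_INT32 then
        -- Values above the signed maximum wrap around to negative numbers
        return u - TWO_POW_32
    end
    return u
end

print(toSignedInt32(108685832))  -- 108685832
print(toSignedInt32(3369260912)) -- -925706384
```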
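Corrected Lua function that does this:
local maxUInt32 = 4294967296 -- 2^32, the modulus that wraps values into the unsigned 32-bit range
local maxInt32 = 2147483647 -- 2^31 - 1, the largest signed 32-bit value
local function HashCode(str)
    local val = 0
    local n = str:len()
    for i = 1, n do
        val = (val + str:byte(i) * math.pow(31, n - i)) % maxUInt32
    end
    -- Reinterpret the unsigned result as a signed int, as Java does
    return val > maxInt32 and val - maxUInt32 or val
end
print(HashCode("roblo")) -- 108685832
print(HashCode("roblox")) -- -925706384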
*Strictly speaking, my function returns a double-precision float, because all Lua numbers are doubles, but I treat it like a uint32 to emulate Java.
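One caveat that follows from Lua numbers being doubles: for longer strings, str:byte(i) * math.pow(31, n - i) can grow past 2^53, the point where doubles stop representing every integer exactly, so the result may drift from Java’s. A possible workaround, sketched here and not tested in Roblox, is to mirror the Java loop (h = 31*h + c) and wrap after every character, which keeps each intermediate value below 2^37. The name hashCodeHorner is mine, and the sketch assumes plain ASCII input, since Java hashes UTF-16 code units while str:byte returns bytes:

```lua
local TWO_POW_32 = 4294967296 -- 2^32
local MAX_INT32 = 2147483647  -- 2^31 - 1

-- Sketch: emulate Java's String.hashCode() for ASCII strings.
local function hashCodeHorner(str)
    local h = 0
    for i = 1, #str do
        -- h < 2^32, so 31 * h + byte < 2^37, which a double stores exactly
        h = (31 * h + str:byte(i)) % TWO_POW_32
    end
    -- Reinterpret the unsigned result as a signed 32-bit int
    if h > MAX_INT32 then
        h = h - TWO_POW_32
    end
    return h
end

print(hashCodeHorner("roblo"))  -- 108685832
print(hashCodeHorner("roblox")) -- -925706384
```

For short seeds like the two above, this should give the same values as the power-based version; the difference would only show up once the seed gets long enough for the powers of 31 to lose precision.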