String seed to numerical seed conversion for procedural terrain generation (like Minecraft) only works up to 5 characters

I am scripting a procedural voxel terrain generation module, and it’s been going well so far. Currently it only accepts numerical seeds, but I want a system where you can also enter a string seed that gets hashed into a numerical seed, as in Minecraft or Terraria. According to my research, Minecraft does this through Java’s String.hashCode() function, whose source looks like this:

public int hashCode() {
    int h = hash; // cached hash; 0 means "not computed yet"
    if (h == 0) {
        int off = offset;
        char val[] = value;
        int len = count;

        for (int i = 0; i < len; i++) {
            // int arithmetic: wraps around modulo 2^32 on overflow
            h = 31*h + val[off++];
        }
        hash = h;
    }
    return h;
}

According to a 2-year-old devforum post I found, this is equivalent to s[0]*31^(n - 1) + s[1]*31^(n - 2) + ... + s[n - 1], or in Lua:

local max32 = math.pow(2, 32) - 1

local function HashCode(str)
	local val = 0

	local n = str:len()

	for i = 1, n do
		val = (val + str:byte(i) * math.pow(31, n - i)) % max32
	end

	return val
end

I tried this and, surprisingly, it works, to a certain extent. I looked up the seed’s numerical hash on a 3rd party website, then generated two separate worlds in Minecraft, one from the string seed and one from the numerical seed the website gave me, and both worlds were the same. However, my Lua output only matches for strings of up to 5 characters. If I use the seed roblox, the code outputs a numerical seed that doesn’t match, whereas roblo works fine. Can anyone help me understand why this isn’t accurate for longer seeds?

I should mention that hashCode only produces 32-bit signed integers; however, that’s the least of my concerns, because I’m certain a 6 character string seed wouldn’t produce anything larger than that.

Hi, I wrote that 2-year-old post.

What’s the 3rd party website you’re using? How do you know what to expect the hash to be, and what are you getting instead with my code?

Nice to have you reply. And to put this into perspective, I used 2 sources to check against the hash output from your code. One was the third party website and the other was simply generating 2 worlds in Minecraft using the string seed and the “supposed” numerical seed.

Here’s what happens with a 5 character seed:
Your code: [image]
3rd party website: [image]

Here’s what happens with anything longer than 5 characters:
Your code: [image] [image]
3rd party website: [image] [image]

They’re certainly not hitting the 32-bit signed integer limit, though I’m interested as to why it only works for up to 5 characters.

“They’re certainly not hitting the 32-bit signed integer limit”

Actually, they are: a signed 32-bit int only covers -2,147,483,648 to 2,147,483,647, so roughly ±2.1 billion.

Java’s hashCode returns a signed int, while mine returns an “unsigned int”*.

So if you treat my function’s output as a signed int instead, by subtracting 2^32 whenever it’s bigger than an int’s max value (2^31-1), you get the same answer:

108685832 → not bigger than 2^31-1 → do nothing

3369260912 → bigger than 2^31-1 → subtract 2^32 → -925706384

Corrected Lua function that does this:

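local maxUInt32 = 4294967296 -- 2^32: one past the largest unsigned 32-bit value
local maxInt32 = 2147483647 -- 2^31 - 1: the largest signed 32-bit value

local function HashCode(str)
	local val = 0

	local n = str:len()

	for i = 1, n do
		-- Add each term s[i] * 31^(n - i), wrapping modulo 2^32 like a Java int
		val = (val + str:byte(i) * math.pow(31, n - i)) % maxUInt32
	end

	-- Reinterpret values above the signed range as negative, as Java's int does
	return val > maxInt32 and val - maxUInt32 or val
end

print(HashCode("roblo")) --> 108685832
print(HashCode("roblox")) --> -925706384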

*Actually, my function returns a double-precision floating-point number, because all Lua numbers are doubles, but I treat it like a uint32 to emulate Java.
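
One caveat that follows from this footnote: a double represents integers exactly only up to 2^53, and str:byte(i) * 31^(n - i) passes that once the exponent reaches 10, so for seeds of about eleven characters or more the function above may drift from Java's result. Below is a minimal sketch that sidesteps this by mirroring the Java loop (h = 31*h + c) and reducing modulo 2^32 at every step, so intermediates stay below 2^37. The names HashCodeHorner, TWO_POW_32 and MAX_INT32 are made up for illustration, and the match with Java is only assumed for single-byte (ASCII) seeds, since Java hashes UTF-16 chars while Lua reads bytes.

```lua
local TWO_POW_32 = 4294967296 -- 2^32, the modulus for 32-bit wraparound
local MAX_INT32 = 2147483647  -- 2^31 - 1, largest signed 32-bit value

-- Hypothetical helper: Horner-form version of the hash for ASCII strings.
local function HashCodeHorner(str)
	local h = 0
	for i = 1, #str do
		-- h < 2^32 here, so 31*h + byte < 2^37: exactly representable
		h = (31 * h + str:byte(i)) % TWO_POW_32
	end
	-- Reinterpret the unsigned result as a signed 32-bit int
	if h > MAX_INT32 then
		return h - TWO_POW_32
	end
	return h
end

print(HashCodeHorner("roblo"))       --> 108685832
print(HashCodeHorner("roblox"))      --> -925706384
print(HashCodeHorner("hello world")) --> 1794106052
```

For seeds of up to ten characters both versions should agree; the difference only matters for longer strings.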

Right, I see. I had heard that Minecraft worlds are generated from 64-bit signed integers, which is what confused me. Thanks for taking the time to help me though! :slightly_smiling_face:
