How to efficiently store CFrame values in datastores?

I have a sandbox building game based on a simple grid system where users place parts to build on their base. I use a datastore to save and load each player’s base when they play the game. I am currently saving the position and orientation of all parts by saving each part’s CFrame data into a datastore.

However, the problem with this is that the positional CFrame values are very large. I believe this happens because of a loss of precision, so the X, Y, and Z values of a CFrame are never precise enough. For example, the X value being stored in the datastore is -42.00000762939453 instead of being just -42.

I have tried rounding these values to 2-3 decimal places, however after many save/load cycles parts start to slowly shift away from each other. My guess is that each time the game loads the rounded values the new CFrame values become large again and they just keep getting rounded over and over resulting in lost precision.

So my question is, is there a more efficient or better way to store part CFrame or position/orientation values in a datastore? Or am I doing something wrong? If I do not round these values they seem to work fine, however they take up way too much base storage with how many decimal places there are.

If you are worried about precision of numbers, you may want to try using logarithms or multiplying a by scale factor. For example, if you have a value that is -42.0000018258128512, I can save this value as a whole number by multiplying it by some value.

For example

warn(math.log10(52), 10^math.log10(52)) 1.7160033436347992 52.000001

Here we see the loss of precision you were mentioning, at this point it would be so small that you can math.floor(x)

Another idea is to convert the number to a string and save it as such. For example, if we have 95, tostring would save it as a char, then on runtime we tonumber the value. I will say, the larger the number is the more storage it would take up.

Idk if this what you are looking for, but the first thing I thought of after reading sandbox building is this video he makes good videos on placement system and datastore.

1 Like

I think I am confused on how either of these methods would help with precision loss. Is not the precision loss occurring because the CFrame values being saved each time are never precise enough?

Even if I store them as logs or strings wouldn’t there still be precision loss as they are saved over and over again since they have to be converted back to regular numbers when loaded?

I suspect the reason why this is happening to you is this because your CFrame values were already floats (not integers) to begin with; this is an issue with your sandbox implementation.

It doesn’t help that floating point errors are a thing.

This is a pretty tricky problem to solve (especially considering datastores store all your data in strings).

An easier solution here though, is to throw all those problems under the rug, by implementing sharding - you basically “split a player’s data” into shards, which are stored in different datastores accordingly.

Now granted, your players will take a bit longer to load in since there’s now more datastores to read from, but that beats having to deal with builds going off by x amount of studs every load-in.

This is redundant by the way, since datastores will automatically cast all data into strings prior to storage.

You are right otherwise though, floats-as-strings do take up a lot more storage.

1 Like

You’re right, CFrames are always floats. The only thing I would be concerned about with using multiple datastores would be how often they can be read/written to. Because the game already has multiple datastores (for saving more than 1 base), I am afraid of hitting read/write limits if I am using so many of them at once.

If you want to proceed with sharding, it may be better to also implement caching.

The idea here is to “get data from the datastores” once, cache it into a ModuleScript, make all your data changes in that cache, and then only save it into datastores when it is done (eg, the player leaves).

You may not even need multiple datastores by the way - you can shard data into several keys under the same store.

That being said, datastore limits are really generous when it comes to how much data you can store in a single key - 4,194,304 characters per key to be exact.

Assuming the serialized data that each part would have would take up 150 characters (a rather highball estimate), that’s about 27962 parts you can store in a single key.

Optimize your serialization well enough, and you can comfortably hit 30k per key.

Plenty lots of parts you can store in a single datastore key, I think.

I am able to store a generous amount of parts within a datastore, but what tends to take up the most space is not parts but objects such as text labels, decals, meshes, etc that have a lot of properties that need to be saved. I’ve actually done a lot to optimize serialization, including just not storing things that can be easily recreated with scripts and storing property names/contents with as few characters as possible.

I have noticed that players with a lot of complex objects can hit the limit, which is extremely rare but possible. I was really hoping to find a way to deal with these large CFrame values as they can still take up a reasonable amount of storage by being unrounded. Maybe there is something wrong with how I was rounding them and then loading them in, I’m not exactly sure. But I do know precision loss is occurring for some reason and leaving them unrounded seems to prevent it for the most part.

Rounding a float to any lower floating point will always result in precision loss, and as I mentioned, floating point errors are a thing.

It’s overall a pretty wacky problem to try to fix, and honestly you’ll spend more time doing that than just implementing sharding and data caching.

You kind of just solved your own issue (which no offence, is just another xy problem all over again) - just don’t round your CFrames.

From a player’s perspective, I’d rather deal with longer load times than my builds going slightly off the mark that only gets worse every load.

If you want to shrink the amount of data that things like TextLabels, Decals, or Textures would take up, consider compressing their Text or ContentId properties.

Also consider limiting the amount of “special parts” a player can have - such as TextLabels.

Yea I was afraid that there would be no good way to get around this and it makes sense now. I will for sure look into utilizing multiple datastores, even using another to double the current capacity would be more than sufficient.

Compression would help too, I haven’t looked into that much. I’ve also started realizing that the more customization that objects have the more properties that will have to be saved, so I’ve been carefully cutting back on them and only keeping what’s necessary.

I really appreciate your help!

Speaking of by the way, compression is exactly how you shrink your data usage.

In other words, you do not change the values themselves - you change how it’s “represented” as data within your datastores.

Ignoring the fact that this is just an xy problem (in that your problem isn’t exactly how to shrink CFrames, it’s how to efficiently store large amounts of data), you’ve been going about this problem the wrong way.

Compress, not mutate.

I just compressed a table of data I have been storing (so far just all parts) and it now uses 90% less storage. It went from 50% to only 5%! This is huge, I may not even need another datastore! Thanks so much for the idea, I had no idea how easy it was to use an open source compression module!

1 Like

… sandbox building game based on a simple grid system …
… is there a more efficient or better way to store part CFrame or position/orientation values in a datastore?

My first question is: do you need to store the CFrame?

If all placed objects are grid-aligned:

  1. Do you allow placement along the Y axis? If not, you can throw that out and just store the X/Z grid indices
  2. If user are able to rotate objects …
    • Are they able to rotate them in true 3D space, or are they only able to rotate a specific axis e.g. yaw?
    • If it’s the latter: only save that axis, and only do so if that value is greater than 0

etc etc


If you do need to store the CFrame:

Alongside using some lossless compression algorithm like those found here, you could also compress the CFrame itself by considering the matrix as a Vector3 and a Quaternion, e.g.:

Example

Note: Roblox doesn’t yet support binary data to be saved to the DataStore, as discussed here, so you would have to encode the buffer as Base64 before saving it e.g. this repo

local LP_EPSILON = 1e-6
local I16_PRECISION = 32767

-- i.e flags for CFrame datatype, learn more:
--        - https://www.alanzucconi.com/2015/07/26/enum-flags-and-bitwise-operators/
--        - https://turnerj.com/blog/fun-with-flags-enums-and-bit-shifting

local sigFlags = {
                                      -- ---------------------
                                      -- | BINARY  | Decimal |
                                      -- ---------------------
  None        = 0,                    -- | 0000000 |       0 |
  Coordinate  = 1,                    -- | 0000001 |       1 |
  Positional  = bit32.lshift(1, 1),   -- | 0000010 |       2 |
  Rotational  = bit32.lshift(1, 2),   -- | 0000100 |       4 |

  X           = bit32.lshift(1, 3),   -- | 0001000 |       8 |
  Y           = bit32.lshift(1, 4),   -- | 0010000 |      16 |
  Z           = bit32.lshift(1, 5),   -- | 0100000 |      32 |

  Component   = bit32.lshift(1, 6),   -- | 1000000 |      64 |
  --                                     ---------------------
}

-- i.e. maps a sigFlag to a quaternion index, i.e. 1 = x; 2 = y, 3 = z
local quatIndex = {
  [sigFlags.X] = 1,
  [sigFlags.Y] = 2,
  [sigFlags.Z] = 3,
}

-- i.e. maps a quaternion index (1 = x) to a sigflag
local quatRelative = {
  [1] = sigFlags.X,
  [2] = sigFlags.Y,
  [3] = sigFlags.Z,
}

-- e.g. some custom datatype you could define
local customDatatypes = {
  -- e.g. some custom datatype(s)
  GridAlignedData = 'table', -- blahblah
}

--[=[
  Computes a normalised quaternion given a CFrame

  @param cf CFrame      -- a CFrame value

  @return varargs<...>  -- normalised components of a quaternion
]=]
local function computeNormalisedQuaternion(cf)
  local axis, angle = cf:ToAxisAngle()

  local x = axis.X
  local y = axis.Y
  local z = axis.Z

  local d = x*x + y*y + z*z
  if d > LP_EPSILON then
    d = 1 / math.sqrt(d)

    x *= d
    y *= d
    z *= d
  else
    x = 1
    y = 0
    z = 0
  end

  local ha = angle*0.5
  local sha = math.sin(ha)
  local w = math.cos(ha)
  x = sha * x
  y = sha * y
  z = sha * z

  d = x*x + y*y + z*z + w*w
  if d > 0 then
    d = 1 / math.sqrt(d)
    return x*d, y*d, z*d, w*d
  end

  return 0, 0, 0, 1
end

--[=[
  Compress a quaternion into { 1x u8, ... 3x i16 }

  @param qx number      -- x component of a quaternion
  @param qy number      -- y component of a quaternion
  @param qz number      -- z component of a quaternion
  @param qw number      -- w component of a quaternion
  @param qw boolean     -- (optional) boolean to det. whether we will
                           attempt to compress the quaternion such that
                           its qi component represents a quaternion's index
                           if its value is approx. 1

  @return varargs<...>  -- the compressed quaternion components
]=]
local function compressQuaternion(qx, qy, qz, qw, useMinimal)
  local index = -1
  local value = -math.huge

  local element, v0, v1, v2
  for i = 1, 4, 1 do
    local val = select(i, qx, qy, qz, qw)
    local abs = math.abs(val)
    if abs > value then
      index = i
      value = abs
      element = val
    end
  end

  -- i.e. if 1x component is approx. 1 then we can reduce to single u8; since rotation (w, x, y, z) = -1*(w, x, y, z)
  if useMinimal and math.abs(1 - value) < LP_EPSILON then
    return index + 4
  end

  local sign = element >= 0 and 1 or -1
  if index == 1 then
    v0 = math.floor(qy * sign * I16_PRECISION + 0.5)
    v1 = math.floor(qz * sign * I16_PRECISION + 0.5)
    v2 = math.floor(qw * sign * I16_PRECISION + 0.5)
  elseif index == 2 then
    v0 = math.floor(qx * sign * I16_PRECISION + 0.5)
    v1 = math.floor(qz * sign * I16_PRECISION + 0.5)
    v2 = math.floor(qw * sign * I16_PRECISION + 0.5)
  elseif index == 3 then
    v0 = math.floor(qx * sign * I16_PRECISION + 0.5)
    v1 = math.floor(qy * sign * I16_PRECISION + 0.5)
    v2 = math.floor(qw * sign * I16_PRECISION + 0.5)
  elseif index == 4 then
    v0 = math.floor(qx * sign * I16_PRECISION + 0.5)
    v1 = math.floor(qy * sign * I16_PRECISION + 0.5)
    v2 = math.floor(qz * sign * I16_PRECISION + 0.5)
  end

  return index, v0, v1, v2
end

--[=[
  Compress a quaternion into { 1x u8, ... 3x i16 }

  @param cf CFrame      -- a CFrame value
  @param qw boolean     -- (optional) boolean to det. whether we will
                           attempt to compress the quaternion such that
                           its qi component represents a quaternion's index
                           if its value is approx. 1

  @return varargs<...>  -- the compressed quaternion components
]=]
local function compressQuaternionFromCFrame(cframe, useMinimal)
  local qx, qy, qz, qw = computeNormalisedQuaternion(cframe)
  return compressQuaternion(qx, qy, qz, qw, useMinimal)
end

--[=[
  Attempts to derive a valid quaternion from a compressed quaternion
  as given by the `::compressQuaternion` and `::compressQuaternionFromCFrame` method

  @param ... varargs<...> -- compress components

  @return varargs<...>  -- the x, y, z and w components of the inflated quaternion
]=]
local function decompressQuaternion(qi, q0, q1, q2)
  -- decompress minimal where index describes one q component that's 1
  if qi > 4 then
    if qi == 5 then
      return 1, 0, 0, 0
    elseif qi == 6 then
      return 0, 1, 0, 0
    elseif qi == 7 then
      return 0, 0, 1, 0
    elseif qi == 8 then
      return 0, 0, 0, 1
    end
  end

  -- decompress components
  q0 /= I16_PRECISION
  q1 /= I16_PRECISION
  q2 /= I16_PRECISION

  local d = math.sqrt(1 - (q0*q0 + q1*q1 + q2*q2))
  if qi == 1 then
    return d, q0, q1, q2
  elseif qi == 2 then
    return q0, d, q1, q2
  elseif qi == 3 then
    return q0, q1, d, q2
  end

  return q0, q1, q2, d
end


----------------------------------------------
--                                          --
--                  EXAMPLE                 --
--                                          --
----------------------------------------------

--[[

  [!] Note:

    The 'writeExampleBuffer' could do with some work here:

      1. you should be iterating through the values and determining the len of your
         buffer first before creating it

    OR;

      2. You should be dynamically resizing your buffer

    The same can be said for the read example which could be expanded to
    read a u8 that would describe the datatype; the reader would read that flag
    before decoding the value

]]

------

--[=[
  Attempts to write the given value to the buffer according to its datatype

  @param datatype string -- the desired datatype
  @param value    any    -- the value to write

  @return (1) boolean         -- reflects success state
          (2) buffer | string -- either (a) the resulting buffer or (b) an error message

]=]
local function writeExampleBuffer(datatype, value)
  local t = typeof(datatype)
  if t ~= 'string' then
    return false, string.format('Expected string for DataType but got %q', t)
  end

  local trueDataType = customDatatypes[datatype] or datatype
  t = typeof(value)

  if datatype ~= t then
    return false, string.format(
      'TypeMismatch: expected value as type %q but got %q',
      datatype, t
    )
  end

  local buf
  if datatype == 'CFrame' then
    local vector = value.Position
    local qi, q0, q1, q2 = compressQuaternionFromCFrame(value, true)
    local hasPositionalData = vector.Magnitude >= LP_EPSILON
    local hasRotationalData = qi <= 4

    local sg, sz
    if hasPositionalData and hasRotationalData then
      sg = sigFlags.Coordinate
      sz = 19
    elseif hasPositionalData then
      sg = sigFlags.Positional
      sz = 13
    elseif hasRotationalData then
      sg = sigFlags.Rotational
      sz = 7
    else
      sg = sigFlags.None
      sz = 1
    end

    local index
    if qi > 4 then
      index = quatRelative[qi - 4]

      if index then
        index = bit32.bor(index, sigFlags.Component)
      end
    else
      index = quatRelative[qi]
    end

    if index then
      sg = bit32.bor(sg, index)
    end

    local offset = 1
    buf = buffer.create(sz)
    buffer.writeu8(buf, 0, sg)

    if hasPositionalData then
      buffer.writef32(buf, offset + 0, value.X)
      buffer.writef32(buf, offset + 4, value.Y)
      buffer.writef32(buf, offset + 8, value.Z)
      offset += 12
    end

    if hasRotationalData then
      buffer.writei16(buf, offset + 0, q0)
      buffer.writei16(buf, offset + 2, q1)
      buffer.writei16(buf, offset + 4, q2)
      offset += 6
    end
  else
    -- e.g. some other datatype

  end

  if buf then
    return true, buf
  end

  return false, string.format('NotImplementedError: unknown datatype %q', datatype)
end

--[=[
  Attempts to read the given datatype from the given buffer at the given offset

  @param buf      buffer -- the buffer to read from
  @param datatype string -- the desired datatype to read
  @param offset   number -- (optional) buffer offset index

  @return (1) boolean         -- reflects success state
          (2) buffer | string -- either (a) the resulting value or (b) an error message
          (3) number          -- the offset after reading the value
]=]
local function readExampleBuffer(buf, datatype, offset)
  local t = typeof(buf)
  if t ~= 'buffer' then
    return false, string.format('Expected BufferObject for buffer but got %q', t), offset
  end

  t = typeof(datatype)
  if t ~= 'string' then
    return false, string.format('Expected string for DataType but got %q', t), offset
  end

  t = typeof(offset)
  if t == 'nil' then
    offset = 0
  elseif t ~= 'number' then
    return false, string.format('Expected number|nil for offset but got %q', t), offset
  end

  if datatype == 'CFrame' then
    local sig = buffer.readu8(buf, offset + 0)
    offset += 1

    local coordinate
    local x, y, z
    local qi, q0, q1, q2
    local qx, qy, qz, qw

    local isComponent = bit32.band(sig, sigFlags.Component) == sigFlags.Component
    if isComponent then
      sig = bit32.band(sig, bit32.bnot(sigFlags.Component))
    end

    if bit32.band(sig, sigFlags.Coordinate) == sigFlags.Coordinate then
      -- some coordinate frame incl. position + rotation
      sig = bit32.band(sig, bit32.bnot(sigFlags.Coordinate))
      qi = sig > 0 and quatIndex[sig] or 4
      qi = isComponent and qi + 4 or qi

      x = buffer.readf32(buf, offset + 0)
      y = buffer.readf32(buf, offset + 4)
      z = buffer.readf32(buf, offset + 8)

      q0 = buffer.readi16(buf, offset + 12)
      q1 = buffer.readi16(buf, offset + 14)
      q2 = buffer.readi16(buf, offset + 16)

      qx, qy, qz, qw = decompressQuaternion(qi, q0, q1, q2)
      coordinate = CFrame.new(x, y, z, qx, qy, qz, qw)
      offset += 18
    elseif bit32.band(sig, sigFlags.Positional) == sigFlags.Positional then
      -- some positional coordinate +/- rotation
      sig = bit32.band(sig, bit32.bnot(sigFlags.Positional))
      qi = sig > 0 and quatIndex[sig] + 4 or 8

      x = buffer.readf32(buf, offset + 0)
      y = buffer.readf32(buf, offset + 4)
      z = buffer.readf32(buf, offset + 8)

      qx, qy, qz, qw = decompressQuaternion(qi)
      coordinate = CFrame.new(x, y, z, qx, qy, qz, qw)
      offset += 12
    elseif bit32.band(sig, sig) ~= sigFlags.None then
      -- some rotational coordinate
      if bit32.band(sig, sigFlags.Rotational) == sigFlags.Rotational then
        sig = bit32.band(sig, bit32.bnot(sigFlags.Rotational))
        qi = sig > 0 and quatIndex[sig] or 4
        qi = isComponent and qi + 4 or qi
      else
        qi = sig > 0 and quatIndex[sig] or 0
        qi = isComponent and qi + 4 or qi
      end

      if qi <= 4 then
        q0 = buffer.readi16(buf, offset + 0)
        q1 = buffer.readi16(buf, offset + 2)
        q2 = buffer.readi16(buf, offset + 4)
        offset += 6
      end

      qx, qy, qz, qw = decompressQuaternion(qi, q0, q1, q2)
      coordinate = CFrame.new(0, 0, 0, qx, qy, qz, qw)
    else
      -- identity
      coordinate = CFrame.identity
    end

    return true, coordinate, offset
  else
    -- e.g. some other datatype

  end

  return false, string.format('NotImplementedError: unknown datatype %q', datatype), offset
end


----------------------------------------------
--                                          --
--                    DEMO                  --
--                                          --
----------------------------------------------

-- e.g. some wrapper function around your read/write methods ...
local function someWrapperFunction(target)
  local success, result = writeExampleBuffer('CFrame', target)
  if success then
    success, result = readExampleBuffer(result, 'CFrame', 0)
    if success then
      return true, { Target = target, Result = result }
    end
  end

  return false, result or 'Unknown error occurred'
end

-- [?] Note:
--    The following is wrapped in a function because I tested it using a storybook,
--    just rip out the contents and use it in some script to test it
--

return function ()
  local exampleCFrame = CFrame.new(0, 5, 0) * CFrame.Angles(math.pi*0.25, 0, -math.pi*0.25)

  local success, result = someWrapperFunction(exampleCFrame)
  if not success then
    warn(result)
    return
  end

  local part = Instance.new('Part')
  part.Size = Vector3.one + Vector3.zAxis
  part.Shape = Enum.PartType.Block
  part.CanCollide = false
  part.CanQuery = false
  part.CanTouch = false
  part.Anchored = true

  local parts = { }
  for name, cframe in next, result do
    local p = part:Clone()
    p.Name = name
    p.CFrame = cframe
    p.BrickColor = BrickColor.Random()
    p.Parent = workspace
    table.insert(parts, p)
  end

  return function ()
    for i, v in next, parts do
      pcall(v.Destroy, v)
    end
  end
end

Some things to note in this example:

  • Storing a CFrame naively in a buffer would cost 48 bytes if you wrote the components as float32;

  • We can reduce that a little more by storing its position as a Vector3 via 3x float32 and either (a) 3x float32 for the euler angles, or (b) its axis angle representation which would be 4x float32 - this gets you to 24 - 28 bytes respectively

  • The following example is dynamically sized:

    • In the best case, e.g. when storing an identity quaternion, it would be represented as 1 byte
    • In the case of a CFrame containing only rotational data, it would be presented as 7 bytes
    • In the case of a CFrame containing only positional data, it would be represented as 13 bytes
    • In the worst case, where a CFrame contains both positional and rotational data, it would be represented as 19 bytes
  • This means that - compared to the naive method - in the best case we’re achieving space saving of ~97.92% and in the worst case, we’re at ~60.42%; i.e. compression ratio of 2.08% - 39.58%


P.S. You can do a similar thing with the grid-aligned data if you’re only considering (x, z, yaw):

  1. Encoding yaw:

    • Map the yaw to -180 to 180
    • Write it as an int8
  2. Encoding X / Z:

    • Since it’s grid-aligned, we can store it as an integer
    • Assuming the players can’t place items at any position:
      • You could compute the grid-aligned position relative to the placement origin, e.g. they can only place in a 100x100 square, mapped to the grid indices - this can be further reduced by storing it as one integer where the index would now be: index = x + width*y
      • Stored as an integer of an appropriate size, e.g. uint8 would give you a range of 0 - 255 or a uint16 of 0 - 65,535
1 Like

HEY!!! THANK YOU SO MUCH!!! The video you linked was helpful, but he linked to a separate video that was EXACTLY what i was looking for (data compression via encoding). I cannot thank you enough for sharing this!!

1 Like