A byte is a sequence of 8 bits used to represent a value. A bit is a single binary digit, either a 1 or a 0. A byte can therefore represent 256 unique values (2^8). To reiterate, a byte is fundamentally just an 8-digit binary number, so it can hold any value between 0 and 255 (256 values in total).
In Lua, string.byte returns the numeric character code of each character in the requested range of a string. In general (although this is not required), these values match the ASCII table.
local str = "ABC"
-- print the character codes corresponding to A, B, and C
print(str:byte(1, 3)) -- 65 66 67
I assume when you’re referring to “funky text” you’re referring to the representation of these bytes in hexadecimal (base 16). The convention of representing bytes with hexadecimal digits is widespread among programmers. The reason is that converting between binary and hexadecimal is trivial and the result is compact: every 4 bits can be represented with a single hexadecimal digit, so one byte is always exactly two hexadecimal digits.
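To make the binary/hexadecimal relationship concrete, here is a minimal sketch. The helper names toBinary and toHex are my own (they are not built into Lua); only string.byte, string.format, and table.concat come from the standard library.

```lua
-- Hypothetical helper: write a byte value (0-255) as 8 binary digits.
local function toBinary(byte)
    local bits = {}
    for i = 7, 0, -1 do
        local place = 2 ^ i -- value of this binary place: 128, 64, ..., 1
        if byte >= place then
            bits[#bits + 1] = "1"
            byte = byte - place
        else
            bits[#bits + 1] = "0"
        end
    end
    return table.concat(bits)
end

-- Hypothetical helper: write every byte of a string as two hexadecimal digits.
local function toHex(str)
    local parts = {}
    for i = 1, #str do
        parts[#parts + 1] = string.format("%02X", str:byte(i))
    end
    return table.concat(parts, " ")
end

print(toBinary(65)) -- 01000001
print(toHex("A"))   -- 41
print(toHex("ABC")) -- 41 42 43
```

Splitting 01000001 into two groups of 4 bits gives 0100 (4) and 0001 (1), which is exactly the hexadecimal 41.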
Positional Numeral Systems
To generalize the notion of these “base-n” systems: they are simply alternate ways to write the same values. The base determines both the places (each place is a power of the base) and the number of unique symbols needed to write numbers. For example, we’re accustomed to base 10 (decimal), in which the places are powers of 10 and there are 10 unique symbols (conventionally 0, 1, 2, …, 9). In binary, the places are powers of 2 and there are 2 unique symbols (conventionally 0 and 1). In hexadecimal (base 16), the places are powers of 16, and we keep the ten decimal symbols and add 6 more, for 16 in total (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F).
The hexadecimal number 41, for example, represents the value “four 16s and one 1,” which in decimal is equivalent to 65 (which happens to be the code for A in ASCII).
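You can check this arithmetic in Lua itself. This is just a small sketch using the standard tonumber (which accepts a base as its second argument) and string.char; the variable names are arbitrary.

```lua
-- "4 sixteens and 1 one", written out by hand
local byHand = 4 * 16 + 1
print(byHand) -- 65

-- tonumber can parse a string in a given base (16 here)
local parsed = tonumber("41", 16)
print(parsed) -- 65

-- Lua also accepts hexadecimal literals with the 0x prefix
print(0x41 == 65) -- true

-- and string.char turns the character code back into a character
print(string.char(parsed)) -- A
```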
Strings
In many languages, a string is just a sequence of characters, where a character is often a single byte. This depends on the encoding you choose (e.g. UTF-16 uses at least 2 bytes per character). In Lua, though, each character of a string is treated as 1 byte.
The string “ABC”, then, can be decomposed into 3 characters, which for our purposes are just bytes. A is encoded as 65 (or 0x41, where the 0x prefix is a convention marking the number as hexadecimal), B as 66, and C as 67. Lua allows you to build strings from character codes directly by writing a backslash followed by the decimal code, for instance:
local str1 = "\65\66\67"
local str2 = "ABC"
print(str1 == str2) -- true
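The same round trip can be done at runtime instead of with escapes. A minimal sketch, assuming nothing beyond the standard string.byte and string.char; the fallback on the first line is only there because unpack lives in different places depending on the Lua version.

```lua
-- table.unpack exists in Lua 5.2+ and Luau; the global unpack exists in Lua 5.1 and Luau.
local unpack = table.unpack or unpack

local str = "ABC"

-- Decompose the string into a table of its character codes
local codes = { str:byte(1, -1) }
print(#codes)                   -- 3
print(table.concat(codes, " ")) -- 65 66 67

-- Rebuild the string from those codes
local rebuilt = string.char(unpack(codes))
print(rebuilt == str)           -- true
```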
Uses
There are many uses that come from understanding how strings are represented. One very popular use case is the Base64 encoding scheme, which is occasionally used to serialize binary data into text so that it can be saved into DataStores, transferred over HttpService, etc. Another use does in fact come from encryption. A simple cipher named ROT13 operates by “offsetting” each letter in your string by 13 (wrapping around if it passes the end of the alphabet). More complicated encryption schemes also generally operate per character. Another very popular use case is hashing.
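As an illustration of the ROT13 idea, here is one way it could be written in Lua. The function name rot13 is my own, and this sketch assumes only the unaccented letters A-Z and a-z should be shifted; everything else is left untouched.

```lua
-- Hypothetical helper: shift every ASCII letter 13 places, wrapping within the alphabet.
local function rot13(str)
    -- The extra parentheses drop gsub's second return value (the match count).
    return (str:gsub("%a", function(char)
        local code = char:byte()
        -- 97 is "a" and 65 is "A"; pick the start of the matching alphabet
        local base = code >= 97 and 97 or 65
        return string.char((code - base + 13) % 26 + base)
    end))
end

print(rot13("Hello"))        -- Uryyb
print(rot13(rot13("Hello"))) -- Hello
```

Because the alphabet has 26 letters, applying ROT13 twice gives back the original text, which is also why it offers no real security.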
The purpose of hashing is to transform a string into a hash using a one-way function. A common use is storing passwords: a good hash generally cannot be reversed, but you can hash the user-supplied password and compare the result to the stored hash to confirm whether it is the “correct” password. Most hashing functions operate at the byte level; that is, some algorithm (be it a sequence of sums and modulos, bitwise operations, or a combination of both) combines the bytes of the input to create the hash.
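To show what “a sequence of sums and modulos” over bytes can look like, here is a toy sketch modelled on the well-known djb2 string hash. The name simpleHash is my own, and this is NOT a cryptographic hash: it only illustrates the byte-level mechanics and should never be used for passwords.

```lua
-- Keep the result within 32 bits (2^32) so it stays exact in every Lua version.
local MODULUS = 4294967296

-- Hypothetical helper: a toy, non-cryptographic hash in the style of djb2.
local function simpleHash(str)
    local hash = 5381
    for i = 1, #str do
        -- Mix in one byte at a time: multiply, add the byte, then wrap around
        hash = (hash * 33 + str:byte(i)) % MODULUS
    end
    return hash
end

print(simpleHash("ABC"))                      -- 193450027
print(simpleHash("ABC") == simpleHash("ABC")) -- true (same input, same hash)
print(simpleHash("ABC") == simpleHash("ABD")) -- false
```

The same input always produces the same number, while the number alone does not tell you which string produced it, and that is the property password checks rely on (with a far stronger function than this one).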