What's 'utf8' and is it useful?

Hey. Lately I’ve seen there’s a ‘utf8’ library in Roblox. I’ve heard of UTF-8, but I never really knew what it is.
Can you explain to me what it is, what it’s useful for and why it’s useful?

Thanks.

I think it’s a character encoding for Unicode, i.e. a way of storing Unicode characters as bytes.

Oh. But is it useful enough that I should understand it, or is it only worth learning if I want to improve my knowledge?

According to the API reference it is for UTF-8 encoding.

UTF-8 is just one of the encoding schemes used on the internet. You don’t need to know about it in detail, but it’s sort of like a map that assigns each character a numeric value stored as bytes, designed so that it stays compatible with ASCII.
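
To make the ASCII compatibility concrete, here is a minimal sketch. It is written in Python rather than Luau purely for illustration, and the sample string is made up. It checks that text containing only ASCII characters produces exactly the same bytes under both encodings:

```python
# Hypothetical sample text that contains only ASCII characters.
text = "Roblox 123 @#$"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

# For pure-ASCII text the two encodings give identical bytes.
same_bytes = ascii_bytes == utf8_bytes
print(same_bytes)

# Every byte of ASCII text is below 128, so the top bit is always 0.
all_below_128 = all(b < 128 for b in utf8_bytes)
print(all_below_128)
```

This is why old ASCII files can usually be read as UTF-8 without any conversion.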

I wouldn’t worry too much about it unless you need to encode data for transfer outside of Roblox.

Oh. I understood, thank you for your clear explanation! :slight_smile:

There are many character encoding formats. The idea is to encode characters into bytes so that the receiving end can decode the bytes back into characters using the same format. That’s how data such as “Hello” can be sent over a network. You can’t send the letters themselves, but you can send the byte equivalent of the message, which in UTF-8 (and ASCII) is: 01001000 01100101 01101100 01101100 01101111
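
As a rough illustration of that round trip, here is a small Python sketch (Python is just my choice for the example, it is not Roblox code). It encodes “Hello” to UTF-8 bytes, prints them as bits, and decodes them back:

```python
message = "Hello"

# Encode: characters -> bytes.
encoded = message.encode("utf-8")
print(list(encoded))  # [72, 101, 108, 108, 111]

# Show each byte as 8 binary digits.
bits = " ".join(f"{b:08b}" for b in encoded)
print(bits)  # 01001000 01100101 01101100 01101100 01101111

# Decode: bytes -> characters, using the same format.
decoded = encoded.decode("utf-8")
print(decoded == message)  # True
```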

Since all computers fundamentally work with binary (zeros and ones), any machine can receive the message as bytes. The chosen encoding format is then used to decode the bytes back into readable characters.

The reason there are different formats, such as ASCII and UTF-8, is simply the way technology advanced. ASCII can only encode the digits 0 to 9, the letters A to Z and a to z, and a few punctuation and control characters (@, #, $, etc.), while UTF-8 can encode all 1,112,064 valid Unicode code points.

The 8 in UTF-8 means that it works in 8-bit units (bytes). Each character takes between 1 and 4 bytes: ASCII characters take 1 byte, and less common characters take more.
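
To see how many bytes UTF-8 actually uses per character, here is a quick Python check (again only an illustration, and the sample characters are my own picks). It counts the encoded bytes of four characters from different parts of Unicode:

```python
# Sample characters, written as escapes so the source file's own
# encoding cannot affect the result: A, é, €, and a smiley emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Map each character to the number of bytes UTF-8 needs for it.
byte_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in byte_lengths.items():
    print(f"U+{ord(ch):04X} takes {n} byte(s)")
```

The four characters come out at 1, 2, 3 and 4 bytes respectively, which is the full range UTF-8 uses.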

If a machine encodes a message with UTF-8 but the receiving machine decodes it as ASCII, any non-ASCII characters in the message won’t come out the same, since the two formats map bytes to characters differently.
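
Here is a hedged Python sketch of that mismatch (the word “café” is just an example I chose). The sender encodes with UTF-8 and the receiver tries to decode as ASCII:

```python
original = "caf\u00e9"           # "café", with a non-ASCII last letter
sent = original.encode("utf-8")  # sender uses UTF-8

# A strict ASCII decode rejects the bytes outright.
try:
    sent.decode("ascii")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
print(strict_ok)  # False

# A lenient decode swaps the unknown bytes for replacement characters,
# so the received text no longer matches what was sent.
lossy = sent.decode("ascii", errors="replace")
print(lossy == original)  # False

# A message with only ASCII characters survives the mismatch.
plain_ok = "Hello".encode("utf-8").decode("ascii") == "Hello"
print(plain_ok)  # True
```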

ASCII is an older encoding format that can only encode 128 characters.

Okay. Thank you for your explanation! :slight_smile:

You can learn about it if you want to, but you don’t have to. Notepad lets you pick the encoding when you save a file, so try comparing UTF-8 with the other options (the one Notepad labels “Unicode” is actually UTF-16).