Maybe slightly off topic, but text is actually really easy to compress.
I think they just compress comments when they’re stored on servers.
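
To give a rough idea of how well text compresses, here is a minimal Python sketch using the standard zlib module. The helper name compression_ratio and the sample comment are made up for illustration, and I have no idea which compressor (if any) a given platform actually uses.

```python
import zlib


def compression_ratio(text: str, level: int = 6) -> float:
    """Return compressed size divided by original size (lower is better)."""
    raw = text.encode("utf-8")
    packed = zlib.compress(raw, level)
    return len(packed) / len(raw)


# Hypothetical batch of very similar comments, stored together as one blob.
batch = "first! great video, thanks for uploading. " * 50
print(f"batch ratio: {compression_ratio(batch):.3f}")

# Compression is lossless: decompressing gives back the exact original text.
restored = zlib.decompress(zlib.compress(batch.encode("utf-8"))).decode("utf-8")
print(restored == batch)
```

Worth noting that a single short comment barely shrinks on its own (it can even get slightly bigger because of the format overhead), so the savings mostly show up when lots of text is compressed together.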
If I’m not mistaken, platforms like YouTube, Twitter and basically every other site could theoretically just look for comments with identical text and de-duplicate them, so each distinct text only gets stored once.
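
Here is a small Python sketch of that de-duplication idea, assuming comments are keyed by a SHA-256 hash of their text. The CommentStore class and its methods are hypothetical names I made up, not anything a real platform is known to use.

```python
import hashlib


class CommentStore:
    """Toy store that keeps each distinct comment text only once."""

    def __init__(self):
        self._bodies = {}    # digest -> comment text, stored a single time
        self._comments = []  # (author, digest) for every posted comment

    def post(self, author: str, text: str) -> None:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        # Only the first occurrence of a given text is actually stored.
        self._bodies.setdefault(digest, text)
        self._comments.append((author, digest))

    def unique_bodies(self) -> int:
        return len(self._bodies)

    def all_comments(self):
        # Rebuild the full comment list by looking each text up via its digest.
        return [(author, self._bodies[d]) for author, d in self._comments]


store = CommentStore()
store.post("alice", "first!")
store.post("bob", "first!")
store.post("carol", "great video")
print(store.unique_bodies())      # 2 distinct texts stored
print(len(store.all_comments()))  # 3 comments still visible
```

This only catches exact matches: "first!" and "First!" hash differently, so they would be stored separately unless the text is normalized beforehand.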