Generating an entire audio file is very expensive. If Roblox uses 44.1 kHz audio (I don't think it uses 48 kHz), that's 44,100 samples per second. Each sample is usually 2 bytes wide, and stereo means 2 channels, so for one second of audio you need to write 44,100 × 2 × 2 = 176,400 bytes.
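
To make the arithmetic concrete, here is a rough sketch in Python (used only for illustration, this is not Roblox code). The 44.1 kHz rate, 16-bit samples, and stereo layout are the assumptions from the post, and the function names are made up for this example.

```python
# Assumed format: 44.1 kHz, 16-bit (2-byte) PCM, stereo.
# These are guesses about what Roblox uses, not confirmed values.
SAMPLE_RATE_HZ = 44_100
BYTES_PER_SAMPLE = 2
CHANNELS = 2


def bytes_per_second(sample_rate_hz=SAMPLE_RATE_HZ,
                     bytes_per_sample=BYTES_PER_SAMPLE,
                     channels=CHANNELS):
    """Raw PCM bytes needed for one second of audio."""
    return sample_rate_hz * bytes_per_sample * channels


def bytes_for_duration(seconds, **fmt):
    """Raw PCM bytes needed for a clip of the given length in seconds."""
    return round(seconds * bytes_per_second(**fmt))


if __name__ == "__main__":
    print(bytes_per_second())                       # one second at 44.1 kHz
    print(bytes_per_second(sample_rate_hz=48_000))  # same, if it were 48 kHz
    print(bytes_for_duration(60))                   # a one-minute clip
```

So a one-minute clip in this format is already over 10 MB of raw samples to generate.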
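Obviously, one optimisation is a ring buffer approach, where you only keep a small window of samples in memory and overwrite it as playback moves forward, but it's still very expensive to render audio in real time.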
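
For anyone unfamiliar with the ring buffer idea, a minimal sketch might look like the following. It is plain Python for illustration only, not a Roblox API, and the `RingBuffer` class and its method names are hypothetical. The producer writes freshly rendered bytes, the consumer reads them for playback, and the fixed-size buffer wraps around instead of growing.

```python
class RingBuffer:
    """Fixed-size byte ring buffer (hypothetical helper, illustration only)."""

    def __init__(self, capacity):
        self._buf = bytearray(capacity)
        self._capacity = capacity
        self._read = 0   # index of the oldest unread byte
        self._size = 0   # number of unread bytes currently stored

    def __len__(self):
        return self._size

    def free_space(self):
        return self._capacity - self._size

    def write(self, data):
        """Store as much of data as fits; return the number of bytes stored."""
        n = min(len(data), self.free_space())
        write_pos = (self._read + self._size) % self._capacity
        first = min(n, self._capacity - write_pos)   # bytes before wrapping
        self._buf[write_pos:write_pos + first] = data[:first]
        self._buf[0:n - first] = data[first:n]       # wrapped remainder
        self._size += n
        return n

    def read(self, n):
        """Remove and return up to n of the oldest bytes."""
        n = min(n, self._size)
        first = min(n, self._capacity - self._read)  # bytes before wrapping
        out = bytes(self._buf[self._read:self._read + first])
        out += bytes(self._buf[0:n - first])         # wrapped remainder
        self._read = (self._read + n) % self._capacity
        self._size -= n
        return out
```

This only bounds memory use. Every sample still has to be computed at the playback rate, which is why real-time rendering stays expensive even with a ring buffer.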