Another use case: dynamically equalizing the volume of sounds whose amplitude is too low or too high. For example, on a custom ID radio, you could reduce the volume if someone decides to play an ear-killing song, or make a song louder if it was uploaded too quietly.
@all, there are a lot of small details to work out.
For example, you say “volume of a sound heard by the client, in dB”. Is this actually what you want? For example, if the sound is distant should the returned value be lower? (I like the idea of the returned value being in dB. Not sure if “Amplitude” is the best term.)
For a visualizer, you probably want to have some plugin that generates quantiles of the sound. This would mostly remove the need to have a logarithmic scale or to know peak values. E.g., you would find 4 values: “20% of input has a lower value”, “40% is lower than this”, “60% is lower”, and “80% is lower”. From there you can quantize the sound into 5 levels: 0-20%, 20%-40%, 40%-60%, 60%-80%, 80%-100%. It should be a good starting point, as each level will be active for an equal amount of time. (You can come up with other heuristics as well.)
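To make the quantile idea concrete, here is a rough sketch in Python (not Roblox Lua). It assumes you have already collected a list of loudness readings from somewhere; `quantile_thresholds` and `to_level` are made-up helper names, and the nearest-rank pick is only one of several reasonable quantile definitions.

```python
from bisect import bisect_right

def quantile_thresholds(levels, percents=(20, 40, 60, 80)):
    """Return values below which roughly 20%, 40%, ... of the readings fall."""
    ordered = sorted(levels)
    n = len(ordered)
    # Nearest-rank style pick using integer math to avoid float rounding.
    return [ordered[min(n - 1, p * n // 100)] for p in percents]

def to_level(value, thresholds):
    """Map one loudness reading to a band index 0..len(thresholds)."""
    return bisect_right(thresholds, value)

# Hypothetical loudness readings, one per visualizer frame.
loudness = [0.05, 0.9, 0.4, 0.1, 0.7, 0.3, 0.2, 0.6, 0.8, 0.5]
thresholds = quantile_thresholds(loudness)
bands = [to_level(v, thresholds) for v in loudness]
```

With these ten readings each of the 5 bands ends up with two frames, which is the "equal amounts of time" property described above.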
sound.Frequency is also possible, but can be confusing. It would likely be the dominant FFT bin or some similarly crude frequency estimator. I’m not sure if we would need to re-calculate this. This one might need extra implementation effort if it isn’t a builtin feature of fmod.
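For anyone wondering what a "dominant FFT bin" estimator looks like, here is a minimal Python sketch. It uses a naive O(n²) DFT purely for readability; a real implementation would use an FFT, and this is not how fmod or Roblox computes anything. The function name and the test tone are assumptions for illustration.

```python
import cmath
import math

def dominant_frequency(samples, sample_rate):
    """Return the centre frequency (Hz) of the strongest non-DC DFT bin."""
    n = len(samples)
    best_bin, best_mag = 0, 0.0
    # Only bins 1..n/2 are needed for a real-valued signal; bin 0 is DC.
    for k in range(1, n // 2 + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate / n

# Hypothetical input: a 1000 Hz sine sampled at 8000 Hz, 64 samples long.
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * i / rate) for i in range(64)]
estimate = dominant_frequency(tone, rate)
```

The estimate can only land on bin centres (here every 125 Hz), which is part of why this kind of estimator is crude.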
For both, there would also need to be an associated time range. For example 1/60th, 1/30th, 1/10th of a second, etc… Otherwise you could choose a sampling point near zero, even for loud sounds.
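A quick Python sketch of why the time range matters: it computes an RMS level in dB over fixed windows. It assumes samples are floats where 1.0 is full scale; `window_levels_db` and the `floor_db` value for silence are made up for the example.

```python
import math

def window_levels_db(samples, sample_rate, window_seconds=1 / 30, floor_db=-100.0):
    """RMS level in dB (relative to amplitude 1.0) for each full window."""
    size = max(1, round(sample_rate * window_seconds))
    levels = []
    for start in range(0, len(samples) - size + 1, size):
        chunk = samples[start:start + size]
        rms = math.sqrt(sum(s * s for s in chunk) / size)
        # log10(0) is undefined, so silent windows get an arbitrary floor.
        levels.append(20 * math.log10(rms) if rms > 0 else floor_db)
    return levels

# Hypothetical input: a full-scale 300 Hz sine at 3000 samples per second.
rate = 3000
tone = [math.sin(2 * math.pi * 300 * i / rate) for i in range(300)]
levels = window_levels_db(tone, rate)
```

The very first sample of this loud tone is exactly 0, while every 1/30th-second window reports about -3 dB, so a single sampling point would badly misjudge it.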
imho, no. 3D and 2D sounds should be treated the same. Users can factor in attenuation to suit their needs, but factoring out any built-in attenuation would be harder (e.g. you must account for ListenerType).
That’d be cool, and I think dominant frequency is built into fmod. People could do more creative things with more detailed FFT info.
In case anyone wants to make a visualizer-helper plugin:
Another idea would be to make an iterative map of the sound level. This is a 2D scatter plot of x[n+1] vs x[n]. On that plot, a simple threshold becomes a square, so you can set the levels such that the squares contain 20%, 40%, etc. of the points. This would give you simple levels where you maximize transitions during the playback while having equal on-time for each level.
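Here is one way to read the iterative-map idea, sketched in Python. The assumption is that a threshold t corresponds to the square [0, t] x [0, t], so a point (x[n], x[n+1]) is inside exactly when the larger of the two values is at most t. The helper names are invented, and the sketch only picks the thresholds; it does not check the claim about maximizing transitions.

```python
def square_thresholds(levels, percents=(20, 40, 60, 80)):
    """Smallest thresholds whose squares hold at least p% of the map points."""
    pairs = list(zip(levels, levels[1:]))  # points (x[n], x[n+1])
    # A point is inside the square for t exactly when max(point) <= t.
    corners = sorted(max(a, b) for a, b in pairs)
    n = len(corners)
    result = []
    for p in percents:
        k = max(1, -(-p * n // 100))  # ceil(p * n / 100) points needed
        result.append(corners[k - 1])
    return result

def fraction_inside(levels, t):
    """Fraction of map points that fall inside the square for threshold t."""
    pairs = list(zip(levels, levels[1:]))
    return sum(1 for a, b in pairs if max(a, b) <= t) / len(pairs)

# Hypothetical loudness readings (arbitrary units).
readings = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
cuts = square_thresholds(readings)
```

Repeated values can make neighbouring thresholds coincide, so with real data you would probably want many more readings than this.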