Changing rate to cut filesize?

**Springheel** · January 17, 2013

I did this by unpacking all of the sound archives and re-encoding them at drastically lower bitrates. Many of the background musics are at 499 kb/s and many of the vocals were at 239 kb/s. This was WAY overkill, so I brought it down to like 112 and 96 respectively.

Can you tell me a bit more about this? Is there really no noticeable difference when you downrate them? I'm all for shrinking sizes of files if possible.

lost_soul · January 17, 2013

Yep, quality in lossy formats is a subjective thing, but all my music is at 160 kb/s OGG and it sounds great. I reasoned that I would have a much harder time hearing artifacts when there are lots of sounds being played at the same time (music, wind, vocals, etc.).

Also, if you look in the Doom 3 paks, the audio is super low bitrate... somewhere around 24 kb/s. I personally think that was going too far and you could easily hear the compression, but I understand why they did it because so many machines had half a gig of memory back then.

Edited January 17, 2013 by lost_soul

demagogue · January 17, 2013

A computer like that isn't great for playing, but I had a little netbook with only like 1 or 1.2GB and while playing it was basically hopeless, I could still work on a map on DR on it and at least open it up in-game to check for leaks or bugs. It made me happy to map on it because it was so tiny & I could take it out to like a cafe easily or whatever. 2GB is closer to possible for playing some maps though. Anyway good luck in your quest. I think for anybody that can hack a game to play on any system, and it makes them happy, more power to them.

Tels · January 17, 2013

Yep, quality in lossy formats is a subjective thing, but all my music is at 160 kb/s OGG and it sounds great. I reasoned that I would have a much harder time hearing artifacts when there are lots of sounds being played at the same time (music, wind, vocals, etc.).

Also, if you look in the Doom 3 paks, the audio is super low bitrate... somewhere around 24 kb/s. I personally think that was going too far and you could easily hear the compression, but I understand why they did it because so many machines had half a gig of memory back then.

What's the actual rate of our voice files? Do we even have a standard?

Re-encoding them from the WAVs should be easy and is non-lossy (you could just redo it). Maybe we can even have a high-quality audio PK4, that people can manually download if they have the memory for it.

SeriousToni · January 17, 2013

Please don't encode your sound files to a low-bitrate format. For users who play on stereo speakers that may be okay, but for people with headsets it is a pain to hear 128kbits/s sounds.

Sotha · January 17, 2013

What's the actual rate of our voice files? Do we even have a standard?

http://wiki.thedarkmod.com/index.php?title=Sound_File_Formats

Sotha · January 17, 2013

I did a quick encoding test:

http://www.sendspace.com/file/fi6kh5

q0.ogg 48kb/s

q1.ogg 60kb/s

q2.ogg 70kb/s

q3.ogg 80kb/s

q4.ogg 86kb/s

q5.ogg 96kb/s (the recommended)

I honestly cannot tell any subjective difference between the audio. Even the lowest encoding sounds exactly the same as the highest.

File size difference between the q0 and q5 (recommended): q0 is almost half of q5.
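For a rough cross-check of that ratio, here is a minimal sketch that assumes file size scales linearly with the average bitrate listed above (container overhead is ignored, and the dictionary name is only illustrative):

```python
# Average bitrates reported for the test files above, in kb/s.
measured_kbps = {"q0": 48, "q1": 60, "q2": 70, "q3": 80, "q4": 86, "q5": 96}

# Size of each file relative to q5, assuming size is proportional to bitrate.
relative_size = {
    name: kbps / measured_kbps["q5"] for name, kbps in measured_kbps.items()
}

for name, ratio in relative_size.items():
    print(f"{name}: {ratio:.2f} x the size of q5")
```

Under that assumption q0 comes out at half the size of q5, which is in line with the observed file sizes.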

After this... I am sure nobody talks to Bikerdude like that.. :laugh:

**Springheel** · January 17, 2013

I know very little about the technical aspects of sound recording, so I'm relying on others' judgement here. But with the memory issues we're running into with 1.08 (which I'm sure has been at least partially caused by adding 200+ new impact sounds) I'm definitely interested in hearing whether there are cost-effective ways to cut down the filesize.

New Horizon · January 17, 2013

If the sounds aren't following our standards they should definitely be downgraded.

Bikerdude · January 17, 2013

q5.ogg 96kb/s (the recommended) After this... I am sure nobody talks to Bikerdude like that..

I only got what you were saying after I listened to the audio tracks and one of my work colleagues asked me what I was laughing at LOL

it is a pain to hear 128kbits/s sounds.

That is all related to the sound format used; in the case of .OGG it's superior to .MP3 at the same bitrate. But where OGG excels is at lower bitrates: as per Sotha's test, 96 kb/s OGG is comparable to 128 kb/s MP3.

Serpentine · January 17, 2013

Just remember that the sample rate supported is fairly static on the decoder side of things. Adjusting the bitrate in cases where it's pointlessly high is not a problem afaik - you just have the possibility of adding additional transcoding artifacts, meaning that batch processing can lead to unexpected consequences. Remember also that the OGGs are decoded to PCM at runtime, so only the raw size really matters.

**Springheel** · January 17, 2013

Any chance of getting that in English? :huh:

grayman · January 17, 2013

Any chance of getting that in English?

Obsttorte · January 17, 2013

I think it is a good idea in general to downsize files if possible. In reference to sound files I think it really comes down to what exactly we are talking about. The "pain in the ear" effect mentioned by SeriousToni should only be there for ambients and such things (so real music). I guess downsizing collision sounds, for example, should not have such an extreme effect on quality.

Another thing is that it makes a big difference whether you are hearing the sounds in a media player (like really listening to them) or hearing them in game. One approach would be to create several pk4's with different rates so people can actually test it in game and then report how it affects specific kinds of sounds.

Serpentine · January 17, 2013

Hmm, ok I'm no expert on audio stuff. We will make some assumptions and cover some likely over simplified/incorrect theory in mono!

Sounds are waves. We can draw nice graphs to represent them. Most interesting sound, however, is rather high frequency, and making pretty graphs of those waves is extremely hard (in the same way that recording them with precision in real time would be), and playing them back would require a lot of calculation. So we can't really use nice waves to store most natural sound on a computer.

Rather we take samples - think of it like a record player's head floating in the air, and the sound as a wave which it runs along. Our sample rate is how many times a second we look at how high/low the needle is with respect to the starting point. The difference is a point of data - it could be 7 away, then the next time 9... and so on.

This works quite well, however there is no way for us to know the true frequency of the wave. Should we be taking a sample every second - and at 0.5 seconds the sound does a little dip then comes back to where it was at 0.9s, we are none the wiser. If we sampled more frequently we would be able to catch more detail in this regard. In such a way, sample rate depends on what the wave you are trying to record is coming from, too low and you lose lots of detail, too high and you end up with lots of 'unnecessary' data points.

Now, that's fantastic and all - we've taken our samples and made a bit of a bar graph out of the wave. Playing this back would sound terrible unless you had sampled at an insanely high frequency to capture the smallest details. That would eat up a load of storage space and mean throwing a lot of data around. This is more or less what raw sound is, however.

We also need to take into consideration each sample's possible magnitude: we are recording these numbers, but we want to keep it all easy to work with. The problem is once again that sound is unpredictable, so just as with graphics we have a bit depth. Just as we have 8 bits per colour channel for a pixel of colour (for example), we could use 16 or 24 bits for our audio sample.

So let's play this back! We take our raw samples: the sample rate is way above what we need to hear it accurately, and the depth is good and has no clipping. We can jump around the recording easily. If we want to go to 23 seconds, we take our constant sample rate, figure out where that starts and continue. However, if we're storing this all in a computer, it helps to know where in memory that position is, so we take our sample rate, multiply it by our sample depth and by the time we want to jump to, and work out how many bits of information the offset is. Fast and efficient - but frigging huge. PCM includes the info about bitrates and such, where true raw does not. Just like TGA, working out PCM size is very simple and fast. This is hopefully what we store in our wav files (they can store other stuff too, they're just containers!).
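The offset arithmetic can be sketched roughly like this. It is only an illustration: the function name is made up, and it assumes plain mono 16-bit PCM at 44100 Hz with no header in front of the sample data.

```python
def pcm_byte_offset(seconds, sample_rate=44100, bits_per_sample=16, channels=1):
    """Byte offset of a playback position in headerless, uncompressed PCM data."""
    # One frame holds one sample per channel.
    bytes_per_frame = (bits_per_sample // 8) * channels
    # Constant sample rate, so the frame index is just time * rate.
    frame_index = int(seconds * sample_rate)
    return frame_index * bytes_per_frame


# Jumping to 23 seconds in a mono 16-bit, 44.1 kHz stream.
print(pcm_byte_offset(23))  # 2028600
```

No searching or decoding is needed to seek, which is the "fast and efficient" part; the price is roughly 88 KB for every second of mono audio.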

Now we want to be more efficient with storing the audio, so we drop our sample rate to match the desired object - a phone call, human voice. This then makes us have more error in our already non-analogue form. Playing it back sounds pretty crap, so we add some filtering that takes those values and tries to smooth the output wave between samples. A good start. Then we decrease the bit depth. Now, just like pictures, dropping depth is sometimes fine, sometimes not. We now end up with a situation where two slightly different values are now the same, giving heavy distortion. So we tune the filtering as best we can, try a few tweaks, but it's not all that great.

Now rather than killing the quality, we look at changing the encoding. Storing that huge bar graph became a mission, so we rather start to just record at high quality, then process it in a way that takes consideration of the recording, but fits into our budget for storage space. First of all, with the whole recording we can eliminate extra bit depth, and we recognize common features that we can store in a simplified way, but restore when we decode it again. However, this encoding means that we're gonna need decoding later on to play it back. How do we play it back? We decode it back to our bar-graph-like state, which we reconstruct as best we can and output in raw. (Ok, some hardware can decode storage formats without software, but in this case - no.) Different encoding schemes are allowed in OGG, and they vary depending on what they are intended for. Vorbis will be our most common - and is a good general purpose one. However, Speex for example is optimized for storing human speech, and as such uses far less information per second to store it, meaning a far lower bitrate.

So lower bitrate means lower information, but it does not always mean lower quality, since that information might be completely useless detail. Encoders will allow you to specify your desired bitrate, then try to squeeze things to fit, where lots of trouble can happen... or conversely very little difference for a large decrease in size. Changing the sample rate means many things, but since your computer's sound system will require specific rates and depths, the engine expects the rather standard or easy-to-adjust values and dislikes completely custom values. (The engine should already moan about this - but still, for some stuff it could be dropped - but this is one which requires listening.) The decoder in the engine makes a few assumptions about what values we can use, sample rates and such, so as to make it fast - but at the end of the day, we're caching decoded PCM stuff and this all makes little difference besides for filesize.
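To put rough numbers on that last point, here is a small sketch. It assumes the decoded cache is plain 16-bit mono PCM and that the encoded file size is simply bitrate times duration; both function names are made up for the example.

```python
def ogg_file_bytes(duration_s, bitrate_kbps):
    """Approximate encoded size: average bitrate times duration."""
    return int(duration_s * bitrate_kbps * 1000 / 8)


def decoded_pcm_bytes(duration_s, sample_rate=44100, bits_per_sample=16, channels=1):
    """Size of the same sound once decoded to raw PCM."""
    return int(duration_s * sample_rate) * (bits_per_sample // 8) * channels


# A one-minute sound at two different bitrates.
for kbps in (96, 48):
    print(kbps, "kb/s:", ogg_file_bytes(60, kbps), "bytes on disk,",
          decoded_pcm_bytes(60), "bytes decoded")
```

Halving the bitrate halves the file on disk (720000 versus 360000 bytes here), while the decoded size stays at 5292000 bytes either way. Under these assumptions only the sample rate, bit depth, channel count and length change the decoded footprint.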

Do it right the first time, I'd only look for clearly errant sizes otherwise.

tl;dr - wav/pcm = bitmap; ogg = jpeg. We internally use bitmaps anyway.

Edit : Oh hey, I forgot to open the search results I was going to just paste anyway, maybe a better overview: http://grahammitchell.com/writings/vorbis_intro.html

Obsttorte · January 17, 2013

And what is your opinion about this topic now? :blink:

**Springheel** · January 17, 2013

Ok, I need some basic things explained to me.

There is what Audacity calls the "project rate". This is generally 44100 Hz for our files, though D3 often used 22050. I believe I experimented with this in the past and found that changing it resulted in little benefit to file size.

What is the difference between this and a "bitrate"? How do I even identify the bitrate in Audacity?

Serpentine · January 17, 2013

You will only really encounter bitrate setting when you are exporting. There should be some advanced options there.

Project Rate is the internal sample rate, i.e. if you're working with something you imported at 44.1k and then import a 22k clip, you will need to resample it, else one will play super quickly and sound stupid. It should be 44100 for D3. Other values will cause trouble, especially if they're used in effects.

SeriousToni · January 18, 2013

They did this with the German Skyrim voice-overs, but 95% of the people didn't hear it over their speakers. Only the ones that used a quality headset. So I'm very sceptical about this. But as long as there's a high quality (or standard quality) soundpack available for download, I don't care if you shrink the sounds to sound like they were played out of a tin can.

Here are two videos showing the shitty compressed German voice sounds of Skyrim (the English ones worked nicely) - if you watch the video and don't notice anything odd, then well..

https://serioustoni.minus.com/mbm4MG1PdN

*mumbles like a grumpy old man* :laugh:

Tels · January 18, 2013

They did this with the German Skyrim voice-overs, but 95% of the people didn't hear it over their speakers. Only the ones that used a quality headset. So I'm very sceptical about this. But as long as there's a high quality (or standard quality) soundpack available for download, I don't care if you shrink the sounds to sound like they were played out of a tin can.

Here are two videos showing the shitty compressed German voice sounds of Skyrim (the English ones worked nicely) - if you watch the video and don't notice anything odd, then well..
https://serioustoni.....com/mbm4MG1PdN

*mumbles like a grumpy old man*

I'm hearing the bad voices with my 20€ headphones..so I hear your point

Sotha · January 18, 2013

How do I even identify the bitrate in Audacity?

When you export the sound file to .ogg, the big Options button in the lower part of the export file name selection screen shows you a quality slider. This quality slider controls the bitrate.

From http://www.vorbis.com/faq/

For now, quality 0 is roughly equivalent to 64kbps average, 5 is roughly 160kbps, and 10 gives about 400kbps. Most people seeking very-near-CD-quality audio encode at a quality of 5 or, for lossless stereo coupling, 6. The default setting is quality 3, which at approximately 110kbps gives a smaller filesize and significantly better fidelity than .mp3 compression at 128kbps.

The philosophers can debate and debate and debate.

It needs one experimentalist to check how the thing likely is. Take a few sounds, encode in lower bitrates. Present for evaluation. Is there a discernible drop in quality or isn't there?

Did any of you über-ears hear any differences in the files I produced? The only difference I'm detecting is the file size.

STiFU · January 18, 2013

What is the difference between this and a "bitrate"? How do I even identify the bitrate in Audacity?

The sampling rate defines how many samples of a signal are recorded or played per second. It needs to remain at 44.1 kHz because we can hear frequencies in the range of 20 Hz to 20 kHz, and the sampling rate must be at least twice as high as the highest signal frequency (i.e. 20 kHz) to avoid aliasing artifacts or untrue reproduction of the signal. You can only use a lower sampling rate if the source signal has only low-frequency components. You could, for example, try to apply a lowpass filter to your audio signal with a cut-off frequency of 10 kHz. If your audio file still sounds fine after that, you may downsample to a sampling rate of 22 kHz.
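As a rough illustration of the factor-of-two rule, the sketch below works out the minimum sampling rate for a given highest frequency, and where a pure tone would end up if it were sampled too slowly. The function names are made up, and it ignores the filtering a real resampler applies before downsampling.

```python
def min_sample_rate(highest_freq_hz):
    """Lowest sampling rate that still satisfies the factor-of-two rule."""
    return 2 * highest_freq_hz


def alias_frequency(signal_hz, sample_rate_hz):
    """Frequency at which a pure tone shows up after sampling (spectral folding)."""
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


print(min_sample_rate(20000))         # 40000, so 44.1 kHz is enough
print(alias_frequency(10000, 22050))  # 10000: below half the rate, unchanged
print(alias_frequency(15000, 22050))  # 7050: folded down, heard as a wrong tone
```

This is why the lowpass at 10 kHz comes first: anything left above roughly 11 kHz would fold back into the audible band at 22 kHz sampling instead of simply disappearing.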

PCM is the lossless direct encoding of a signal. This is what we know as a wav-file.

The bitrate you can adjust in an audio exporter is the bitrate of the lossy encoded audio file. A low bitrate does not necessarily mean low quality, as the perceived quality is signal dependent. A sine wave, for example, can be encoded at VERY low rates and still sound perfect.

Re-encoding them from the WAVs should be easy and is non-lossy (you could just redo it). Maybe we can even have a high-quality audio PK4, that people can manually download if they have the memory for it.

I don't know what you're getting at here, but audioencoders besides PCM are lossy.

7upMan · January 18, 2013

Man, that voice actress must have some really crappy recording equipment. I know for a fact that many voice actors have their own home studio (if you can call it that), so she must be either a real cheapskate or someone without proper training on how to set up audio equipment. However, the audio technician at Bethesda doesn't seem to have a clue either. This very much resembles my experience with the German version of Oblivion. <_<

Tels · January 18, 2013

Tels, on 17 January 2013 - 07:59 AM, said:

Re-encoding them from the WAVs should be easy and is non-lossy (you could just redo it). Maybe we can even have a high-quality audio PK4, that people can manually download if they have the memory for it.

I don't know what you're getting at here, but audioencoders besides PCM are lossy.

What I meant was that if we encode from our WAV to OGG and it sounds shitty, we just encode it again with a higher bitrate - there is no loss for us as we still have the original WAV. And even if the result is slightly sub-par, we can still make the current files available as a "high-quality sound set" for people with enough RAM in their PC.

STiFU · January 18, 2013

Oh yeah, of course. You should never re-encode audio that has already been lossily encoded. Always work from the base wav file!
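A minimal sketch of what "always work from the wav" could look like when batch encoding, assuming the `oggenc` tool from vorbis-tools is installed. The helper name, the folder names and the default quality are made up for the example; this is not an official project workflow.

```python
from pathlib import Path


def oggenc_command(wav_path, out_dir, quality=3):
    """Build an oggenc command that encodes one source WAV at a given quality."""
    wav_path = Path(wav_path)
    out_path = Path(out_dir) / (wav_path.stem + ".ogg")
    # -q sets the Vorbis quality level, -o sets the output file.
    return ["oggenc", "-q", str(quality), "-o", str(out_path), str(wav_path)]


# Always start from the original WAVs, never from an existing OGG.
for wav in sorted(Path("wav_masters").glob("*.wav")):  # hypothetical folder
    cmd = oggenc_command(wav, "ogg_q3", quality=3)
    print(" ".join(cmd))
    # To actually encode: import subprocess; subprocess.run(cmd, check=True)
```

If a given quality level turns out to sound bad in game, the same loop can be rerun with a higher `quality` value and nothing is lost, because the masters are untouched.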
