[Proposal] More specific rule for .ogg files used as audio for the beatmap

Topic Starter
Shoenen
Section: General Ranking Criteria, Audio Rules.

Current rule: "A beatmapset's audio file must use the .mp3 or .ogg file format and have an average bit rate no greater than 192kbps."

Rule modification proposal: "A beatmapset's audio file must use the .mp3 or .ogg file format and have an average bitrate no greater than 208 kbps."
(The changed part was in bold in the original post to make it easier to spot.)

Reason:

TL;DR:
Because of how .ogg files are encoded, it isn't practical to export them at a fixed bitrate. 208 kbps is the average bitrate limit that lets us use both .ogg VBR6 and .mp3 VBR2.

Poem: (this will also be a little technical)
Please read the whole discussion to see how the proposed rule ended up as written above. Thank you, Naxess and Keitaro!

*IMPORTANT*:
This section was written with my previous rule proposal in mind, which was to allow VBR6 for .ogg and keep 192 kbps for .mp3. That is why I haven't included .mp3 VBR2 here. However, .mp3 VBR2 behaves similarly to .ogg VBR6.


Premises: I'm using Audacity to export the audio files for this demonstration. All of them are compressed from a .flac file, which is CD-quality lossless, whereas .ogg and .mp3 are lossy formats.
The .ogg files are exported with the VBR6 quality setting, and the .mp3 files with CBR 192 kbps. The encoding libraries are the ones bundled with Audacity; I haven't modified them or used any custom library.

I'll start by saying that .ogg files can't be exported at a target average bitrate by default. You can read the official FAQ answer from the authors of the format themselves at https://xiph.org/vorbis/faq/#quality .
To be clear, I'm not saying it's impossible to export to a target average bitrate (the TL;DR is a simplification). However, doing so requires tricky, complex work with the libraries, and it won't produce a perfect result anyway, as stated in the FAQ. All of this is out of reach for most people; probably only IT people will know how to do it. Asking for a Computer Science degree just to export audio for osu! is kinda meh in my opinion.

Now, let's move to the actual reason why VBR6 should be the highest allowed quality. Before that, I'll briefly talk about how .ogg compresses, which is totally different from how .mp3 compresses.
First of all, .ogg is NOT how the audio is encoded: the audio is encoded with Vorbis, and Ogg is the container for it. So while .mp3 is an audio format, .ogg is a container format (IT people, don't be mad at me, I'm oversimplifying). That's the first main difference. The second is that the "Quality Setting" for .ogg encoding (VBR-1 to VBR10, integers only) produces different results depending on the song, because of how the algorithm interprets it. (I'll give examples later.)

Knowing this, using average bitrate to compare .mp3 and .ogg isn't a wise choice and will only cause more trouble. I suggest comparing the actual file sizes instead (the size of a CBR 192 kbps .mp3 against a VBR6 .ogg). This follows what peppy wrote in the post where .ogg files were introduced (osu.ppy.sh/community/forums/topics/1021547): "Note that the bitrate rules are NOT for legal reasons as people may have incorrectly stated, but to ensure downloads size is within acceptable limits (especially important as we push forward with mobile platforms)."

Now let's compare 3 different songs (I can give MANY other examples; feel free to contact me for additional audio and testing).
I'll show 3 different situations. The 1st is a 4:30 song that is intense and full of detail throughout, where the .ogg is slightly bigger than the .mp3. The 2nd is a 5:50 song with the same kind of intensity as the 1st, where the .ogg is again slightly bigger than the .mp3. The 3rd is a 6:30 song that is calm for the first 4 minutes and only gets more intense at the end, where, surprisingly, the .ogg is smaller than the .mp3.

1st: My Days by Suzuki Konomi. The .mp3 weighs 6.21 MB, the .ogg weighs 6.34 MB.
Download links:
2nd: Sultans of Swing by Dire Straits. The .mp3 weighs 7.97 MB, the .ogg weighs 8.35 MB.
Download links:
3rd: I'm Glad You're Evil Too by PinocchioP. The .mp3 weighs 8.95 MB, the .ogg weighs 8.27 MB.
Download links:
As you can see, .ogg compression gives different file sizes depending on the song. Orchestral-like songs encoded as VBR6 .ogg will often be bigger than their 192 kbps .mp3 counterpart, while very calm piano-and-voice songs will often be smaller.
On average, VBR6 seems to be the VBR setting that best approximates the 192 kbps .mp3 file size.
That's why I'm asking for more precise audio rules, to avoid possible confusion for BNs/QATs when .ogg files are used in a bubbled/qualified/ranked mapset.
I used long songs to exaggerate the size differences; on TV Size songs the difference is often negligible.
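To make the link between bitrate and file size concrete, here is a minimal Python sketch. It assumes the file size is audio payload only (no tags or container overhead) and that 1 kbps = 1000 bit/s; the function names are made up for illustration.

```python
MIB = 1024 * 1024  # bytes in one mebibyte


def cbr_size_bytes(bitrate_kbps, duration_s):
    """Payload size of a constant-bitrate stream, in bytes."""
    return bitrate_kbps * 1000 / 8 * duration_s


def average_bitrate_kbps(size_bytes, duration_s):
    """Average bitrate implied by a file size and a duration."""
    return size_bytes * 8 / duration_s / 1000


# A 4:30 song at 192 kbps CBR, assuming the duration is exactly 270 seconds.
duration_s = 4 * 60 + 30
size = cbr_size_bytes(192, duration_s)
print(f"192 kbps CBR over {duration_s} s: {size / MIB:.2f} MiB")

# Going the other way: the average bitrate a file of that size implies.
print(f"implied average: {average_bitrate_kbps(size, duration_s):.0f} kbps")
```

Since the rule is about average bitrate and the goal is download size, the two checks are equivalent for a given song length: a longer song at the same average bitrate is simply a bigger file.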

How to know if an .ogg file is VBR6:
In my testing, Spek (version 0.8.2, the latest) always reports the file as 192 kbps. I know this isn't the best method, but it seems to be the easiest.
If I understood correctly, you can use the tools and documentation the authors provide at https://xiph.org/downloads/ (vorbis-tools, to be precise). However, this is far more complex, and it seems you'd actually need to compile your own program.

A little semi-OT:
Unrelated to the rule itself, but .ogg files still aren't properly supported by osu!. You can't drag and drop one like an .mp3 to start editing a mapset; you need to create the mapset with an .mp3, then swap it for the .ogg and edit the .osu of each diff to point to the right audio file. A fix for this would be really appreciated :blobheart:
Naxess
you can determine bitrate for .oggs similarly to .mp3s, but I don't have an idea how to determine if a .ogg is VBR6, so unless there's a way to do that, idk how we'd enforce it (and having a rule that cant be enforced seems a bit odd)

spek may show 192 kbps for VBR6, but that's a bitrate indicator and if the actual bitrate isn't 192 kbps, it'd seem strange relying on it (+it being an external program which rc generally avoids)

tl;dr: how to check if VBR6
Topic Starter
Shoenen

Naxess wrote:

you can determine bitrate for .oggs similarly to .mp3s, but I don't have an idea how to determine if a .ogg is VBR6, so unless there's a way to do that, idk how we'd enforce it (and having a rule that cant be enforced seems a bit odd)

spek may show 192 kbps for VBR6, but that's a bitrate indicator and if the actual bitrate isn't 192 kbps, it'd seem strange relying on it (+it being an external program which rc generally avoids)

tl;dr: how to check if VBR6
Sadly, I don't know how to check it outside Spek. Spek also shows different values for VBR7 (224 kbps) and VBR5 (160 kbps). I think we either accept this, or remove .ogg files as audio because of their nature and the impossibility of checking their encoding info (VBR in this case) as easily as for .mp3 files.

Also, people who check the audio of maps will probably be using external programs anyway, since they are essential for telling an upscaled 192 kbps .mp3 from a proper one, and so on. An upscaled 192 kbps file is file size bloat, another unrankable problem, and the only way to be sure it is upscaled is to look at the spectrum and compare it with a supposedly better-quality file. So even now, a rule requires an external program (and a better audio file) to be checked.
Xinnoh
if there is no way to check, the mapper could just provide the mp3 they used since 99% of cases will have the file as mp3 initially.
Topic Starter
Shoenen

Sinnoh wrote:

if there is no way to check, the mapper could just provide the mp3 they used since 99% of cases will have the file as mp3 initially.
This seems good, but because of how .ogg works, the source file should be a 320 kbps .mp3 or a .flac (or even a .wav, but that's very rare, since you usually have to ask the artists themselves to send it to you).
Of course, I'm assuming those files are not upscaled.
-Keitaro
i think ffmpeg is way much more reliable in terms of bitrate, look below:

Audio sourced from a 320kbps mp3 album
stripped config because goddamn thats a lot of text lol

First song
VB5
ffmpeg version 4.2.3 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 9.3.1 (GCC) 20200523
  configuration: bruh
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
Input #0, ogg, from '5.ogg':
  Duration: 00:04:12.75, start: 0.000000, bitrate: 164 kb/s
    Stream #0:0: Audio: vorbis, 44100 Hz, stereo, fltp, 160 kb/s
    Metadata:
      COMMENTS        : Visit http://dmdokuro.bandcamp.com
      TITLE           : Roar of the Jungle Dragon
      Band            : DM DOKURO
      ARTIST          : DM DOKURO
      track           : 26
      DATE            : 2019
      ALBUM           : The Tale of a Cruel World
VB6
ffmpeg version 4.2.3 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 9.3.1 (GCC) 20200523
  configuration: bruh
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
Input #0, ogg, from '6.ogg':
  Duration: 00:04:12.75, start: 0.000000, bitrate: 195 kb/s
    Stream #0:0: Audio: vorbis, 44100 Hz, stereo, fltp, 192 kb/s
    Metadata:
      COMMENTS        : Visit http://dmdokuro.bandcamp.com
      TITLE           : Roar of the Jungle Dragon
      Band            : DM DOKURO
      ARTIST          : DM DOKURO
      track           : 26
      DATE            : 2019
      ALBUM           : The Tale of a Cruel World
VB7
ffmpeg version 4.2.3 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 9.3.1 (GCC) 20200523
  configuration: bruh
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
Input #0, ogg, from '7.ogg':
  Duration: 00:04:12.75, start: 0.000000, bitrate: 224 kb/s
    Stream #0:0: Audio: vorbis, 44100 Hz, stereo, fltp, 224 kb/s
    Metadata:
      COMMENTS        : Visit http://dmdokuro.bandcamp.com
      TITLE           : Roar of the Jungle Dragon
      Band            : DM DOKURO
      ARTIST          : DM DOKURO
      track           : 26
      DATE            : 2019
      ALBUM           : The Tale of a Cruel World

Second song
VB5
ffmpeg version 4.2.3 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 9.3.1 (GCC) 20200523
  configuration: nyan
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
Input #0, ogg, from '5-1.ogg':
  Duration: 00:04:52.39, start: 0.000000, bitrate: 170 kb/s
    Stream #0:0: Audio: vorbis, 44100 Hz, stereo, fltp, 160 kb/s
    Metadata:
      TITLE           : Wish
      DISCID          : 4B057806
      ALBUM           : japanese text cmd cant render
      catalog         : 4988104115638
      GENRE           : anime
      ISRC (international standard recording code): JPV432000033
      track           : 3/6
      ARTIST          : japanese text cmd cant render
      DATE            : 2020
VB6
ffmpeg version 4.2.3 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 9.3.1 (GCC) 20200523
  configuration: nyan
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
Input #0, ogg, from '6-2.ogg':
  Duration: 00:04:52.39, start: 0.000000, bitrate: 205 kb/s
    Stream #0:0: Audio: vorbis, 44100 Hz, stereo, fltp, 192 kb/s
    Metadata:
      TITLE           : Wish
      DISCID          : 4B057806
      ALBUM           : japanese text cmd cant render
      catalog         : 4988104115638
      GENRE           : anime
      ISRC (international standard recording code): JPV432000033
      track           : 3/6
      ARTIST          : japanese text cmd cant render
      DATE            : 2020
VB7
ffmpeg version 4.2.3 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 9.3.1 (GCC) 20200523
  configuration: nyan
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
Input #0, ogg, from '7-2.ogg':
  Duration: 00:04:52.39, start: 0.000000, bitrate: 231 kb/s
    Stream #0:0: Audio: vorbis, 44100 Hz, stereo, fltp, 224 kb/s
    Metadata:
      TITLE           : Wish
      DISCID          : 4B057806
      ALBUM           : japanese text cmd cant render
      catalog         : 4988104115638
      GENRE           : anime
      ISRC (international standard recording code): JPV432000033
      track           : 3/6
      ARTIST          : japanese text cmd cant render
      DATE            : 2020

yeah so uhhh sometimes VBR6 seem to go way beyond our limit, so maybe VB5 is a better option?
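For anyone who wants to pull the two numbers out of output like the above automatically, here is a minimal Python sketch. It only parses text that has already been captured (the sample is copied from the '6-2.ogg' output), and the regular expressions and function name are illustrative assumptions rather than anything ffmpeg provides. In this thread the Duration line is treated as the actual average and the Stream line as the indicator of the quality setting.

```python
import re

# Excerpt copied from the '6-2.ogg' output above.
SAMPLE = """\
Input #0, ogg, from '6-2.ogg':
  Duration: 00:04:52.39, start: 0.000000, bitrate: 205 kb/s
    Stream #0:0: Audio: vorbis, 44100 Hz, stereo, fltp, 192 kb/s
"""

# Overall bitrate from the Duration line, per-stream value from the Stream line.
OVERALL_RE = re.compile(r"Duration:.*?bitrate:\s*(\d+)\s*kb/s")
STREAM_RE = re.compile(r"Stream #\d+:\d+.*?Audio:.*?(\d+)\s*kb/s")


def read_bitrates(ffmpeg_text):
    """Return (overall_kbps, stream_kbps) from ffmpeg's input summary."""
    overall = OVERALL_RE.search(ffmpeg_text)
    stream = STREAM_RE.search(ffmpeg_text)
    if overall is None or stream is None:
        raise ValueError("no Duration/Stream bitrate lines found")
    return int(overall.group(1)), int(stream.group(1))


overall_kbps, stream_kbps = read_bitrates(SAMPLE)
print(overall_kbps, stream_kbps)
```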

Edit: here are the files for the above tests https://d.rorre.xyz/BfIRNOMEY/testing.zip

Edit 2:
Personally I'd just look at the bitrate instead of the quality setting tbh. It could be VBR6 if it's still around 192, since I don't think there's a way to determine the exact VBR quality, *aside* from the audio stream info, which always shows 160 kbps for VBR5, 192 kbps for VBR6, 224 kbps for VBR7, and so on. (One could argue this is an indicator, which is fair enough, but then we should reference it in the wiki too.)

Though I kinda think VBR5 might be the better cap (if we wanna push the VBR quality rule), to prevent averages of 200 kbps, which I believe goes against the idea of the 192 kbps rule. It also looks like the audio quality doesn't drop much compared to original -> mp3 CBR 192. But I might be wrong, since I only sampled 2 songs xD
Topic Starter
Shoenen

So, you used a 320 kbps .mp3, which is already a lossy format, and re-encoded it as .ogg. VBR6 is meant to be close to the quality of lossless audio, and the spectrum shows this too: for example, a VBR6 file will often go above 20 kHz, while a 320 kbps .mp3 has a hard cap at 20 kHz. This also means that the .ogg encoder had to work with what the .mp3 algorithm "thought" was useful from the original lossless format (probably .wav if the song is sold directly by the artist, or .flac if it came from a CD). As also stated in the FAQ, .ogg and .mp3 encode a song very differently. That is why I used lossless audio as the source: to give both formats the same starting point and make the "competition" as fair as possible.

Another thing: VBR5 will always be around 1 MB smaller* than the CBR 192 kbps .mp3 (ranked mapsets usually use 192 CBR), so it doesn't approximate the 192 kbps .mp3 at all. VBR6, sometimes going over and sometimes under, is actually the best approximation of the 192 kbps .mp3. The fact that VBR5 sounds even better than 192 kbps despite being lighter should not stop us from looking for a better approximation. This is why I'm almost ignoring the bitrate and focusing on the file size, since that is also what really matters to peppy and the devs.
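As a rough sanity check on the "around 1 MB" figure, here is a small Python sketch. It assumes VBR5 averages the nominal 160 kbps that ffmpeg reports for it earlier in the thread (real averages vary per song) and a song of exactly 4:30; the function name is made up for illustration.

```python
def size_gap_mb(high_kbps, low_kbps, duration_s):
    """Size difference in MB (10^6 bytes) between two average bitrates."""
    return (high_kbps - low_kbps) * 1000 / 8 * duration_s / 1_000_000


# 192 kbps CBR .mp3 against a nominal 160 kbps VBR5 .ogg over 270 seconds.
gap = size_gap_mb(192, 160, 4 * 60 + 30)
print(f"{gap:.2f} MB")
```

Under those assumptions the gap scales linearly with song length, which fits the observation that it is negligible on TV Size songs.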

One last thing: when you said "if we wanna push the VBR quality rule", it sounded like we could avoid pushing this rule. We CANNOT avoid it. I currently have 2 non-uploaded mapsets in my osu! folder, and MV says both of them average 193 kbps. Yes, exactly 1 kbps over the limit. We are currently binding one format to the standards of a completely different format. It's like forcing sliders to follow the rules for circles, even though a slider is a completely different object (I know the comparison isn't great, but I hope you get the idea).
This is why, as I said to Naxess, either we accept how .ogg works and make a slight change to the RC accordingly, or we stay in this gray area where VBR5 is safe but VBR6 has a 50-50 chance of being over the 192 kbps average. And if we're talking about objective unrankables, we should not leave any gray area, ANY.

*
This 1 MB is an average from my testing; sometimes the gap is bigger and sometimes smaller, but the VBR5 file has never gone over the 192 kbps CBR .mp3 file size, always under it.
As stated in the poem box, feel free to contact me if you need any additional testing. I'm not posting dozens of spectrograms and ogg-mp3 comparisons, just to avoid filling the discussion with too many images. Also feel free to contact me if you want the .flac files to run the test yourself; I'll happily provide them.
-Keitaro
Edit: we talked in discord, heres summary

ffmpeg and spek seem to be using the formula mentioned here for their "bitrate" reference. this could suffice for working out which VBR setting a file uses, but i doubt it's available in libBASS, which osu! and MV use.

with the formula above i believe it's possible to check for VBR6 with Spek and/or FFmpeg. the sticking point is that AiMod might not be able to check it properly, since the actual bitrate can be above 192kbps.

we already use other tools for checking audio, for example when 128kbps audio sits in a 192kbps container, so i think it's still fair enough to use external tools to check VBR quality settings.

the difference in filesize isn't huge, notably below 1MB in a 4-minute song, so for tv size songs it would be even smaller. I'll provide samples later.

Edit 2:
Here's DragonForce album converted from flac to VBR6: https://pastebin.com/raw/z6DyARMX some goes up, some stays, some goes down compared to 192kbps CBR mp3 (ffmpeg with libvorbis)

From Shoenen: (Audacity)
MP3     OGG
5.83 - 5.82
6.13 - 6.00
5.10 - 4.99
9.02 - 8.28
6.63 - 6.45
6.63 - 6.61
From Minase Inori, Blue Compass album
The first 5 songs+the last

MP3     OGG
6.02 - 6.10
6.83 - 7.06
5.71 - 5.93
5.39 - 5.55
5.82 - 6.00
5.63 - 6.01
All of those songs are around 4 min long
Some are more close to 5
Those are from FELT, Rebirth Story 3
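A rough way to read the two tables above: if each MP3 is assumed to be exactly 192 kbps CBR and each OGG has the same duration, the OGG average bitrate scales with the size ratio. The Python sketch below is only an estimate under that assumption (the sizes are rounded and include tags and container overhead), and the table and function names are made up for illustration.

```python
# (mp3_size, ogg_size) pairs copied from the two tables above, in MB.
FIRST_TABLE = [(5.83, 5.82), (6.13, 6.00), (5.10, 4.99),
               (9.02, 8.28), (6.63, 6.45), (6.63, 6.61)]
SECOND_TABLE = [(6.02, 6.10), (6.83, 7.06), (5.71, 5.93),
                (5.39, 5.55), (5.82, 6.00), (5.63, 6.01)]


def estimated_ogg_kbps(mp3_size, ogg_size, mp3_kbps=192.0):
    """Scale the assumed MP3 bitrate by the OGG/MP3 size ratio."""
    return mp3_kbps * ogg_size / mp3_size


def estimate_range(pairs):
    """Lowest and highest estimated OGG average bitrate in a table."""
    estimates = [estimated_ogg_kbps(mp3, ogg) for mp3, ogg in pairs]
    return min(estimates), max(estimates)


for name, table in (("first table", FIRST_TABLE), ("second table", SECOND_TABLE)):
    low, high = estimate_range(table)
    print(f"{name}: roughly {low:.0f} to {high:.0f} kbps")
```

With these numbers, every estimate for the first table lands below 192 kbps, and every estimate for the second table lands above 192 kbps but below 208 kbps.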
Naxess
Seems like the bitrate reference Keitaro- linked is what ffmpeg and spek use to approximate bitrate, which Shoenen claimed to be consistently 192 kbps for VBR6.

So in other terms, what this proposal wants is to push the average bitrate threshold from 192 kbps to 208 kbps (32 kbps * 6.5, highest quality before being rounded up to VBR7, 224 kbps), so that it better supports the variance in bitrate when encoding OGG files using VBR6?
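The arithmetic behind 208 kbps can be sketched in a few lines of Python. The linear 32 kbps-per-step mapping is an assumption that only matches the nominal values quoted in this thread (160, 192 and 224 kbps for quality 5, 6 and 7); it is not taken from the Vorbis documentation and shouldn't be trusted outside that range.

```python
KBPS_PER_QUALITY_STEP = 32  # assumption: fits only the q5/q6/q7 values in this thread


def nominal_kbps(quality):
    """Nominal bitrate for a Vorbis quality value under the linear assumption."""
    return KBPS_PER_QUALITY_STEP * quality


# Nominal values reported for the three settings tested in the thread.
for q in (5, 6, 7):
    print(f"VBR{q}: {nominal_kbps(q)} kbps")

# Halfway between VBR6 and VBR7: the proposed threshold.
print(f"threshold: {nominal_kbps(6.5)} kbps")
```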

If so, this might as well be applied to MP3 files as well for simplicity, so they may be VBR encoded easier using V 2 / V 3, as described here. This problem (VBR targets not being exact) isn't exclusive to OGG files, after all.

As a result we get a simpler proposal as well:
A beatmapset's audio file must use the .mp3 or .ogg file format and have an average bitrate no greater than 208 kbps.
Am neutral on this myself, but this would make for a way better change than including VBR quality jargon into the RC.
Topic Starter
Shoenen
I agree, "A beatmapset's audio file must use the .mp3 or .ogg file format and have an average bitrate no greater than 208 kbps." is better in technical terms.
A side note about "[..] Shoenen claimed to be consistently 192 kbps for VBR6.": Spek also shows 192 kbps for VBR2 .mp3 files, but not nearly as consistently as for VBR6; sometimes Spek says 160 kbps and sometimes 224 kbps. To be clear, this isn't the true average. It is only an indicator of the settings used for .ogg, and a not very reliable indicator for .mp3.

My only concern about pushing 208 kbps as the maximum is what peppy and the devs will say about it. We know that 192 kbps is fine, and the only thing that matters to them is keeping the file size relatively small; they don't care about the actual audible quality.

I'm going to change the proposed rule to what Naxess wrote. I won't be able to rewrite the whole poem section to also summarize what Naxess and Keitaro said, so I kindly ask readers to read the whole discussion.
Dialect
not an audiophile but i think this could be a cool idea. but i don't know if it'll be added anytime soon considering that .ogg files are still not properly supported. maybe if ppy and devs updated stable to support ogg files?
Topic Starter
Shoenen

Li Syaoran wrote:

not an audiophile but i think this could be a cool idea. but i don't know if it'll be added anytime soon considering that .ogg files are still not properly supported. maybe if ppy and devs updated stable to support ogg files?
To be clear, stable supports .ogg; it's just that drag & drop doesn't quite work for them, while it is flawless for .mp3 files. Stable has no problem using an .ogg as the beatmapset's audio, and both the editor and gameplay work perfectly with .ogg files.
-Keitaro
^

Naxess' proposal sounds good to me, very rarely anything goes beyond 205kbps anyway with these settings so im good with it.
clayton
it would be a little odd because you lose the implication that 192kbps is a good standard for mp3. I think you should make distinction between mp3 and ogg limits here, but you can go more in detail in another wiki article that clears up any jargon and just link to it there (was the whole point of pishi rewriting RC like that)
Topic Starter
Shoenen

clayton wrote:

it would be a little odd because you lose the implication that 192kbps is a good standard for mp3. I think you should make distinction between mp3 and ogg limits here, but you can go more in detail in another wiki article that clears up any jargon and just link to it there (was the whole point of pishi rewriting RC like that)
Actually VBR2 mp3s are slightly better than 192kbps CBR mp3s, so it's kinda fine that we lose the implication that 192kbps mp3 is a good standard, since now we can have slightly higher one.
Illyasviel
Just to note, VORBIS files do have a noticeable impact in vocal clarity compared to mp3 files using LAME (192kbps CBR MP3 vs VBR 6 for VORBIS) so I wouldn't recommend using .ogg in any vocal heavy songs.

As for the rewording proposal, because there is no easy way to determine the exact bitrate of .ogg files, the rule should be worded differently for each format. This will avoid confusing the average user when exporting Vorbis files, since Audacity only mentions "Quality" and not bitrate. So something along these lines would be significantly clearer:

"A beatmapset's audio file must use the .mp3 or .ogg file format. For a .mp3 file, the maximum bit rate (Variable or Constant) must not exceed 224kbps. For .ogg files, no more than Quality 6 (VBR6) must be used."

This way the .mp3 quality would get slightly increased to match Quality 6 of .ogg files while still being clear to users what settings they should use when exporting either format.

I don't know why people in this thread think that changing the bitrate to an arbitrary number that doesn't even exist when exporting either .mp3 or .ogg files is a good idea. Please tell me how someone who isn't familiar with how VBR works (which is 99% of osu! mappers/players) is going to tell what setting is rankable when their options when exporting audio look like this



or this

Topic Starter
Shoenen

Illyasviel wrote:

Just to note, VORBIS files do have a noticeable impact in vocal clarity compared to mp3 files using LAME (192kbps CBR MP3 vs VBR 6 for VORBIS) so I wouldn't recommend using .ogg in any vocal heavy songs.
Please share the evidence. I have pretty good headphones and I can't hear any difference in vocal clarity; if anything, the voice sounds slightly clearer to me (because of the better encoding).
Neither in the many forums I searched online nor in the official FAQ did I read anything about vocal clarity.

Illyasviel wrote:

don't know why people in this thread think that changing the bitrate to an arbitrary number that doesn't even exist when exporting either .mp3 or .ogg files is a good idea.
No, this isn't a an arbitraty bitrate, it is actually the highest Average Bit Rate you'll have with .mp3 VBR2 encoding and .ogg VBR6. We are considering the "worst case", in which you have the highest possible Average Bit Rate.

About "Please tell me how someone who isn't familiar with how VBR works is going to tell what setting is rankable", they don't need to. BNs/QATs are the ones tasked with checking the rankability of an audio file, and they already have plenty of tools and people to ask for this job.
Also, tbh, most mappers upload maps with upscaled songs, which are totally unrankable because of file size bloat. (Also, a lot of mappers don't even know what the RC is...)
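For reference, the kind of check meant here boils down to simple arithmetic: average bitrate is the file size in bits divided by the duration. A minimal sketch follows, assuming the duration is already known (for example read off an audio editor) and that 1 kbps = 1000 bits per second. The function names are made up for illustration, and because the file size includes tags and container overhead, the result slightly overestimates the audio bitrate.

```python
import os

LIMIT_KBPS = 208  # average bitrate limit from the proposed rule


def average_kbps(size_bytes: int, duration_seconds: float) -> float:
    """Average bitrate in kbps from file size and duration."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return size_bytes * 8 / duration_seconds / 1000


def within_limit(size_bytes: int, duration_seconds: float,
                 limit_kbps: float = LIMIT_KBPS) -> bool:
    """True if the average bitrate does not exceed the limit."""
    return average_kbps(size_bytes, duration_seconds) <= limit_kbps


def average_kbps_of_file(path: str, duration_seconds: float) -> float:
    """Same calculation, reading the size from a file on disk."""
    return average_kbps(os.path.getsize(path), duration_seconds)
```

For example, a 4-minute file of 5,760,000 bytes gives `average_kbps(5_760_000, 240)` = 192.0, which is within the 208 kbps limit.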


As a last thing, I would like to keep this thread respectful. Like everyone else who has contributed to this thread, please give us tests/reasons for what you're claiming instead of attacking all of us like in the last quote I've linked.

I'll happily wait for your evidence about the voice clarity thing too.
Illyasviel

Shoenen wrote:

Please give me the evidence. I have pretty good headphones and I can't find any difference in vocal clarity; if anything, the voice sounds slightly clearer to me (because of the better encoding).
Neither searching online in many forums nor reading the official FAQs turned up anything about voice clarity.

I'll happily wait for your evidence about the voice clarity thing too.
Sure, you can use this http://irya.u.catgirlsare.sexy/WAo-W5BI.zip
Vocals sound thinner when encoded using VORBIS. This is noticeable in parts like 1:25 to 1:28. The effect is more noticeable if the song has many vocals (idol songs, for example), so you can try comparing mp3 vs vorbis using idol songs too.

Shoenen wrote:

No, this isn't an arbitrary bitrate; it is actually the highest Average Bit Rate you'll get with .mp3 VBR2 encoding and .ogg VBR6. We are considering the "worst case", in which you have the highest possible Average Bit Rate.

About "Please tell me how someone who isn't familiar with how VBR works is going to tell what setting is rankable", they don't need to. BNs/QATs are the ones tasked with checking the rankability of an audio file, and they already have plenty of tools and people to ask for this job.
Also, tbh, most mappers upload maps with upscaled songs, which are totally unrankable because of file size bloat. (Also, a lot of mappers don't even know what the RC is...)
Unfortunately for the average mapper/player, it is an arbitrary bitrate. While 192kbps is a known bitrate, 208 kbps isn't even listed in any kind of settings when exporting audio. The whole point of the ranking criteria is to help mappers know if what they are doing is correct, not to make them wait for a BN to use a tool to see if their audio is even rankable. That will only lead to even more frustration when attempting to rank a map, especially for new mappers. Saying that it's a QAT/BN's job defeats the whole point of the RC.


Shoenen wrote:

As a last thing, I would like to keep this thread respectful. Like everyone else who has contributed to this thread, please give us tests/reasons for what you're claiming instead of attacking all of us like in the last quote I've linked.
I haven't insulted anyone here, and if it looks like I did, I apologize. But you guys are thinking about the ranking criteria as a super elite document that needs to have numbers and formulas that make sense from your point of view, when the first thing you say to a new mapper is to read the ranking criteria. The more people who understand the wording, the better. After all, BNs and QATs are busy enough dealing with other matters.
-Keitaro

Illyasviel wrote:

Unfortunately for the average mapper/player, it is an arbitrary bitrate. While 192kbps is a known bitrate, 208 kbps isn't even listed in any kind of settings when exporting audio. The whole point of the ranking criteria is to help mappers know if what they are doing is correct, not to make them wait for a BN to use a tool to see if their audio is even rankable. That will only lead to even more frustration when attempting to rank a map, especially for new mappers. Saying that it's a QAT/BN's job defeats the whole point of the RC.
I mean I guess we can just extend it to be more clear...

Proposal wrote:

A beatmapset's audio file must use the .mp3 or .ogg file format and have an average bitrate no greater than 208 kbps. This is achievable by exporting with quality setting 6 for OGG, 170-210 kbps for VBR, and 192 kbps for ABR/CBR.
This should be a decent guide for them to export the files without leaving any confusion.
With the 170-210 kbps setting, anything above 208 kbps as a result is extremely borderline though, so it's either that option or the 155-195 kbps option.
pishifat

Shoenen wrote:

Another thing, VBR5 will always be around 1MB less* than the CBR 192kbps .mp3 (usually the mapsets that get ranked use 192 CBR), so in fact it won't approximate the 192kbps .mp3 at all. Meanwhile VBR6, sometimes going over and sometimes going under, is actually the best approximation to the 192kbps .mp3.

Shoenen wrote:

The fact that VBR5 sounds even better than 192kbps despite being lighter should not prevent us from searching for a better approximation.
this basically tells me that VBR6 should be used if it's under 192kbps average and VBR5 should be used otherwise. shifting the limit up seems unnecessary? majority of people will continue to use mp3 and those who care about the technicalities of .ogg audio can use that
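To put the quoted "around 1MB less" figure in bitrate terms, here is a rough sketch. The 4-minute duration and 1 MB = 1,000,000 bytes are assumptions picked for illustration, not numbers from the thread, and the function names are hypothetical.

```python
def cbr_size_bytes(kbps: float, duration_seconds: float) -> float:
    """File size of a constant-bitrate stream, ignoring tags and headers."""
    return kbps * 1000 * duration_seconds / 8


def implied_kbps(cbr_kbps: float, duration_seconds: float,
                 saved_bytes: float) -> float:
    """Average bitrate of a file that is `saved_bytes` smaller than the CBR file."""
    smaller = cbr_size_bytes(cbr_kbps, duration_seconds) - saved_bytes
    return smaller * 8 / duration_seconds / 1000


# Assumed example: 4-minute song, file 1,000,000 bytes smaller than 192 kbps CBR.
example = implied_kbps(192, 240, 1_000_000)
print(round(example, 2))
```

Under those assumptions the smaller file averages roughly 158.67 kbps. A fixed size saving corresponds to a smaller bitrate gap on longer songs, so the figure depends heavily on the duration chosen.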