Yet Another Mania Star Rating Rework

Topic Starter
T-Bone Shark
Since reading is overrated here is a spreadsheet instead: https://docs.google.com/spreadsheets/d/1nW50Vxp8Dzn8fQq3tbeVeRTtaVFI-e3Xrggkvpzv-aI/edit?usp=sharing

As many of us know, the star rating for mania is pretty meh. Some songs are very overweighted (you know which ones), others less so. Hold notes are imo also underrated, and their effect on difficulty depends on note order in the .osu file itself. This causes issues for newer players, who might end up playing a song much harder than it is given credit for, or be led astray when they pass a low 4 star but struggle on other songs a whole star below.

My first pass on attempting to resolve these issues was broken into two parts:
  1. Holds
    1. Reworking hold notes so that the act of releasing on the tail is considered.
    2. Rewarding the act of just holding a note.
    3. Rewarding holding more than one note while playing others at the same time (the current system only cares about one).
  2. Diff Spike
    1. Forcing the current equation to look at a wider set of max strains.
    2. Punishing songs that spike hard but have low averages, and rewarding maps that instead have averages closer to the max.

Also during the process I enforced an object parse order and fixed a couple input variant cases.
  1. Objects sharing same time position do not share same overall difficulty. Leads to https://github.com/ppy/osu/issues/4010
  2. Hold note bonuses dependent on parse order.
  3. First note of a song has no individual strain.

What led me to work on this project in the first place.
Story time!
It was late September and I was still very bad at the game (implying I have improved ha). A good number of my top scores were still 5K converts (Melody Flag was one, not even 3 stars). I had a very hard time with ultimate ascension and illusion of inflict. Could pass them but scores were around the 680K range. These were only just above 3.2 stars.
Everything changed when I played Nirvana.
Not even 700K and it became my highest pp play and would perpetually swing back into #1 position every time I played it.
Meanwhile Heaven's Fall on Advanced I was lucky to get more than 650K.

So yeah to say I was confused would be putting it lightly.


Preface: Some of this code has already been ported over to the lazer client here -> https://github.com/nbayt/osu/tree/master/osu.Game.Rulesets.Mania
Edit: Lazer version is up to date, but expect most songs to be off by 0.01 - 0.02 stars.

Let's dive into the actual changes, shall we?
Note: Whenever I say beat I mean notes sharing the same start time. Also I might ramble a bit in this section, let me know if I can tidy it up.

The following section applies to column E (New) in the spreadsheet.
Change 1
First simple change is the sorting order for the hit objects. It was originally a stable sort with ascending start time. It is now ascending start time, and then descending end time, and then increasing column position. More on this in a moment.
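
A minimal sketch of that ordering in Python, for illustration only. The `HitObject` tuple, its field names, and the idea that a plain tap has `end_time == start_time` are assumptions made for this example, not the actual lazer types.

```python
from typing import NamedTuple


class HitObject(NamedTuple):
    # Hypothetical stand-in for a parsed mania hit object.
    start_time: float  # ms
    end_time: float    # ms; assumed equal to start_time for a plain tap
    column: int        # 0 = leftmost column


def sort_key(obj: HitObject):
    # Ascending start time, then descending end time, then ascending column.
    return (obj.start_time, -obj.end_time, obj.column)


def sort_hit_objects(objects):
    # sorted() returns a new list and leaves the input untouched.
    return sorted(objects, key=sort_key)
```

Because every tie is broken by the key itself, the result should no longer depend on the order in which objects appear in the .osu file.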

Next up are the changes made to hold notes. The values given are obviously not final, but they seem okay from my testing thus far.
  1. First change is that hold factor is no longer a binary value (1.25 if a note is held, otherwise 1.0). It is now 1.25 for the first held note, and increases by (1.65 - hold factor) * 0.5 for each additional hold note.
  2. Hold addition logic is still the same as before, but with the new sort order it can no longer be obtained from hold notes starting on the same beat. It can only be awarded from earlier hold notes.
  3. If multiple holds share the same end time, then only the leftmost one gets the bonus. I would like to hear feedback on this. This is also how it works currently, but based on the first note listed in the .osu file.
  4. Individual decay in a column is slowed while a note is being held: the decay base goes from 0.125 to 0.18 (a 0.82 reduction instead of a 0.875 reduction).
    1. Time after a note is released is decayed as normal until the next hit object in that column.
  5. The tail of a hold gives individual difficulty to its respective column with a base value of 0.7. (Heads for taps and holds is 2.0 normally).
    1. Hold factor for this note as well. Starts at 1.0 and increases by (1.30 - hold factor) * 0.45 for each hold.
  6. The tail of a hold gives overall difficulty with a flat bonus of 0.35. (No other modifiers). (Heads for taps and holds is 1.0 normally).
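
To make the two hold factor progressions above concrete, here is a rough sketch. The function names and the `holds_active` parameter are made up for this example, and it assumes "for each hold" means one step per hold currently held down; the real code may count differently.

```python
def head_hold_factor(holds_active: int) -> float:
    """Hold factor for a note head while `holds_active` holds are down.

    Assumed reading: 1.0 with no holds, 1.25 for the first hold, then each
    additional hold closes half of the remaining gap to 1.65.
    """
    if holds_active <= 0:
        return 1.0
    factor = 1.25
    for _ in range(holds_active - 1):
        factor += (1.65 - factor) * 0.5
    return factor


def tail_hold_factor(holds_active: int) -> float:
    """Hold factor for a hold tail (assumed reading).

    Starts at 1.0 and each hold closes 45% of the remaining gap to 1.30.
    """
    factor = 1.0
    for _ in range(max(holds_active, 0)):
        factor += (1.30 - factor) * 0.45
    return factor
```

Under this reading the head factor goes roughly 1.25, 1.45, 1.55, 1.60 for one to four holds and approaches 1.65 without reaching it. The tail factor approaches 1.30 in the same way.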

The next small change made was to the code that computes the star rating after hit object difficulty is computed.
  1. Originally the code would find the largest difficulty value in fixed 400ms sections, do a decreasing sort on it, then do a weighted sum.
    1. The weight ratio was changed from 0.9 to 0.92. Since the series now converges towards a larger value, the result is then multiplied by 0.8.
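
A small sketch of that weighted sum, under the assumption that the input is simply the list of per-section maximum difficulties. The function name and parameters are illustrative, and any later scaling into an actual star rating is left out.

```python
def weighted_strain_sum(section_maxima, weight, scale=1.0):
    """Sort section maxima in decreasing order and sum them with
    geometrically shrinking weights (1, weight, weight**2, ...)."""
    total = 0.0
    current_weight = 1.0
    for strain in sorted(section_maxima, reverse=True):
        total += strain * current_weight
        current_weight *= weight
    return total * scale


# A long, perfectly flat map: every 400ms section peaks at the same value.
flat_map = [1.0] * 200
old_value = weighted_strain_sum(flat_map, weight=0.9)
new_value = weighted_strain_sum(flat_map, weight=0.92, scale=0.8)
```

For a flat map the weights sum towards 1 / (1 - 0.9) = 10 with the old ratio and 1 / (1 - 0.92) = 12.5 with the new one, and 12.5 * 0.8 = 10. So the 0.8 multiplier brings the limit back to where it was, while the slower falloff makes the sum look at a wider set of sections.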

Honorable mentions of the listed changes:
  1. Triumph & Regret (5.46, 4.12) -> (5.34, 4.08)
  2. Heaven's Fall [Bye4Now's Advanced] (2.73 -> 3.36)
  3. Heaven's Fall [Bye4Now's Hard] (2.95 -> 3.30)
  4. 4K Dans la mer de son [Toaph's Abyss] (5.23 -> 6.39)
  5. Puffs Rave (Jakads' #SDVX_Edit) [SOUND VOLTEX IS | TAKEN OVER BY OSU!!] (792.28 -> 3490.71) Hmm...


The following section applies to column G (Diff Spike Chg) in the spreadsheet.
Change 2
Next up is the reward / punishment system based on average and max difficulty. This step is applied after the current changes, it does not replace the current star rating calculation.
  1. Side note: Songs below 2.5 stars are affected by this, but by a linearly reduced amount.
  2. Song is broken up into 400ms sections, but this time each section contains the sum of all hit object difficulties rather than a singular max. From these a max and an average is computed.
  3. We then compute what I call the 'error' in the difficulties. For each section we increase the error by the given equation.
    1. ((strains_maximum - strain) / strains_avg) ** 2.25
    2. If most values are near the average, then it behaves like: (max/avg - avg/avg) -> (max/avg) - 1. So having an average near the max reduces error.
  4. The total error is then divided by the number of 400ms sections to get the average error.
  5. Using this we decide if a song should be punished or rewarded based on the error.


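# Function name is illustrative; avg_error is the averaged error from above.
def apply_diff_spike(star_rating, avg_error):
    # Songs below 2.5 stars get a linearly reduced adjustment.
    rating_weight = min(star_rating / 2.5, 1.0)
    # Flat reduction of up to 2% before the bonus or penalty is applied.
    final_rating = (1 - (0.02 * rating_weight)) * star_rating
    if avg_error <= 0.13:
        bonus = 1 + (((0.13 - avg_error) / 0.13) ** 1.80) * 0.10 * rating_weight
        final_rating *= bonus
    else:
        penalty = 1 - (0.12 - ((max(0.15 - (avg_error - 0.13), 0.0) / 0.15) ** 1.45) * 0.12) * rating_weight
        final_rating *= penalty
    return final_rating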

  1. Bonus ranges from 0% to 10% (error from 0.13 down to 0.0)
  2. Penalty ranges from 0% to -12% (error from 0.13 up to 0.28)
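
For reference, the averaged error that feeds into this adjustment could be computed along these lines. This is a sketch only: the function name, the `(start_time_ms, difficulty)` input shape, and counting empty sections as zero strain are all assumptions, not taken from the actual code.

```python
from collections import defaultdict

SECTION_LENGTH_MS = 400


def average_error(hit_objects, exponent=2.25):
    """hit_objects: iterable of (start_time_ms, difficulty) pairs,
    with non-negative start times."""
    # Sum (not max) of all hit object difficulties in each 400ms section.
    sums = defaultdict(float)
    for start_time, difficulty in hit_objects:
        sums[int(start_time // SECTION_LENGTH_MS)] += difficulty
    if not sums:
        return 0.0

    # Assumption: sections without any objects count as zero strain.
    strains = [sums.get(i, 0.0) for i in range(max(sums) + 1)]
    strains_maximum = max(strains)
    strains_avg = sum(strains) / len(strains)
    if strains_avg == 0:
        return 0.0

    # ((strains_maximum - strain) / strains_avg) ** 2.25, summed per section.
    error = sum(((strains_maximum - s) / strains_avg) ** exponent for s in strains)
    # Average over the number of sections.
    return error / len(strains)
```

A map whose sections all carry the same total difficulty gets an error of 0.0 and so the full bonus, while a map with one huge spike and mostly quiet sections gets a large error and is pushed towards the full penalty.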

This still needs a lot of work, bonus is too strong in many cases, but penalty appears to work decently well. Open to any suggestions about this change.

Honorable mentions of the listed changes:
  1. Triumph & Regret (5.46, 4.12) -> (4.59, 4.00)
  2. Nirvana(Camellia's "BinaryHeaven" Remix) [Ascension] (4.35 -> 3.83)
  3. Fascination MAXX [_UJ's 4K Fascination] (4.73 -> 5.28)
  4. Tsukidokei ~ Luna Dial [Lunatic] (3.78 -> 4.01)
  5. Galaxy Collapse [Cataclysmic Hypernova] (6.43 -> 5.72)


Closure:
First off most of you can probably guess this was primarily tested on 4K maps. I don't consider myself good enough with 7K to know how this affects the ratings on that side; let alone 6K or 8K. (Doppelganger getting buffed from 9.21 to 9.61 sounds like it could be a problem).

I feel like the first set of changes is in a good spot right now, much better than the way it was before, though confirmation from people who play at a higher level is needed. That said, I don't think the second set of changes is as good as I would like. The bonus value is way too much, and the punishment needs to be reduced slightly. It also seems rather arbitrary, if you know what I mean.

Further reading: https://github.com/ppy/osu/issues/2553
Also just noticed this: https://github.com/ppy/osu/pull/2449 Will attempt to test this change out.

Thanks for reading, feedback is greatly appreciated regarding my text dump here. Have a great day!
Bobbias
Looking at some of the 7k stats, I noticed a very clear trend. LN based charts are now overrated. All the harder 7k LN charts (let's say 5 stars and up) get huge boosts to SR, to the point of overrating them IMO.

I can't necessarily say HOW overrated, but the numbers feel a bit higher than I'd expect. I mean, sister's noise is not a 7, I'd say it should be more like a mid 6 of some sort.
Elementaires
I agree with Bobbias. I took a quick look at all the LN charts I know on every key count, and I don't think they need that much of a boost in star rating, especially with the pp maps we already have in the 7K rankings.
I think it's too difficult to find a good balance... but maybe with a buff that high, it will finally counterbalance the DT speed spam meta? (which I'm not complaining about ;))
Topic Starter
T-Bone Shark
Thanks for the feedback, really appreciate it. I ended up nerfing most of the LN changes in general, especially the ones controlling overall difficulty. The aforementioned song now ends up at 6.79, so I still have some tweaking I need to do to bring it down a bit more. Got it down to 6.5, hopefully this is more reasonable!

I updated the spreadsheet with the new values, and I'll also grab some more songs to check as well. Let me know what you think.