Mania pp Algorithm

Topic Starter
Ciel
Preface: If this post comes off as harsh, I'm sorry about that. I don't intend to blame Tom for anything I'm about to discuss, since he has mentioned that he is not currently working on the pp algorithm and that he doesn't actually play this game mode.

So recently on reddit, there was a post about how Tom has released the pp calculation algorithms involved in the game. Unfortunately, this does not include any info about how the various attributes of each beatmap are calculated, but that's not what I'm going to be talking about anyway.

As shown in both the osu wiki and the actual source itself, there are two parts to calculating the value of any given play: Accuracy and Strain. However, I have found issues in how both of these are calculated.

Accuracy

(relevant code is in lines 127-144)

The Accuracy value of a play is calculated with a single formula, followed by a multiplier based on the length of the map. However, this means there is a hard maximum on the value that can theoretically be obtained.

Given a map with more than 2390 notes, a perfect 100% accuracy play, and OD 8 (one of the most common ODs), the maximum amount of pp earned from this section is a whopping 30 pp. Note that this is completely regardless of how difficult the map actually is, so this statistic is overvalued on easier maps and potentially undervalued on harder maps. However, Tom mentioned that this isn't necessarily an issue, since the Strain value of the pp calculation is also affected by accuracy, so this may not be terrible. That is, until you actually look at how the Strain value is computed.
This is the math box for Accuracy.

_accValue = pow((150.0f / hitWindow300) * pow(Accuracy(), 16), 1.8f) * 2.5f;
Accuracy() is a value between 0.0 and 1.0, so we will take the highest, 1.0.
In addition, OD8 has a hit window of 40.5ms.
_accValue = pow((150.0f / 40.5f) * pow(1.0f, 16), 1.8f) * 2.5f
= 26.392785805...

_accValue *= std::min<f32>(1.15f, pow(static_cast<f32>(TotalHits()) / 1500.0f, 0.3f));
Thus, once TotalHits(), or the total objects in the map, exceeds 2390.107 objects, the multiplier reaches its maximum value of 1.15.
_accValue = 30.3517036757...
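
For anyone who wants to check these numbers, here is a minimal C++ sketch of the same arithmetic. The function names are made up for illustration and are not from the released source, and it uses double instead of the f32 in the original, so the last few digits may differ slightly.

```cpp
#include <algorithm>
#include <cmath>

// Accuracy value as quoted above (illustrative name and signature).
// hitWindow300 is in milliseconds, accuracy is in [0, 1],
// totalHits is the number of objects in the map.
double accValue(double hitWindow300, double accuracy, int totalHits) {
    double value =
        std::pow((150.0 / hitWindow300) * std::pow(accuracy, 16.0), 1.8) * 2.5;
    // Length multiplier, capped at 1.15.
    double lengthBonus = std::min(1.15, std::pow(totalHits / 1500.0, 0.3));
    return value * lengthBonus;
}

// Object count at which the length multiplier reaches its 1.15 cap:
// solve (n / 1500)^0.3 = 1.15 for n.
double lengthCapThreshold() {
    return 1500.0 * std::pow(1.15, 1.0 / 0.3);
}
```

Calling accValue(40.5, 1.0, 3000) gives the capped value of roughly 30.35 for a perfect OD8 play, and adding more objects beyond the threshold does not change it.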

Strain

(relevant code is in lines 82-125)

Each map is given a strain attribute, which is calculated through some process. Afterwards, the strain value of any given play is computed as follows:

If your score is <=500000, your strain value is multiplied by 0.10*(score/500000)
If your score is <=600000, your strain value is multiplied by 0.20*((score-500000)/100000) + 0.10
If your score is <=700000, your strain value is multiplied by 0.35*((score-600000)/100000) + 0.30
If your score is <=800000, your strain value is multiplied by 0.20*((score-700000)/100000) + 0.65
If your score is <=900000, your strain value is multiplied by 0.10*((score-800000)/100000) + 0.85
If your score is >900000, your strain value is multiplied by 0.05*((score-900000)/100000) + 0.95

Now that looks like a number dump, so let me explain it better. First, given a perfect play, the maximum number of points you can earn from this map is equal to the strain value. Let's call this V.

From the score range of 000000 to 500000, your strain value scales linearly from 0.00*V to 0.10*V (0.10 range)
From the score range of 500000 to 600000, your strain value scales linearly from 0.10*V to 0.30*V (0.20 range)
From the score range of 600000 to 700000, your strain value scales linearly from 0.30*V to 0.65*V (0.35 range)
From the score range of 700000 to 800000, your strain value scales linearly from 0.65*V to 0.85*V (0.20 range)
From the score range of 800000 to 900000, your strain value scales linearly from 0.85*V to 0.95*V (0.10 range)
From the score range of 900000 to 1000000, your strain value scales linearly from 0.95*V to 1.00*V (0.05 range)

The result is your Strain value for the play.

This is unfortunately broken in some pretty glaring ways. First off, note that the multiplier does not scale linearly overall, but it does scale linearly within each range. This means there is no smooth progression across the boundaries between ranges. The most glaring issue, however, is that people have already shown that the difficulty of gaining score keeps increasing as the score gets higher, which looks more like an inverse logistic curve than a logistic one. Thus, while these linear ranges form a basic approximation of a logistic curve, that is the inverse of the actual difficulty curve.
I (as well as Tom, I assume) do not really have the time right now to completely recompute a more ideal curve for this scoring system, but this post is more to shed some light on this issue, so that someone else who has the time might be interested in actually fixing this and recomputing the math.
coldloops
interesting.

I made a plot to show score x strain more clearly.

http://imgur.com/a/XsUp4

Also, just for fun, I'm comparing it to a cumulative normal distribution. I know it doesn't solve the difficulty scaling problem, but at least it provides smooth progression over the whole range.

The red curve in the plot is pnorm(score/1000000, mean=0.65, sd=0.15)
code: https://gist.github.com/anonymous/0995d ... af7ccd5fbc
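
For reference, the same curve can be sketched in C++ using std::erfc in place of R's pnorm. The function names here are illustrative and this is not the code from the gist.

```cpp
#include <cmath>

// Cumulative normal distribution, equivalent to R's pnorm(x, mean, sd).
double normalCdf(double x, double mean, double sd) {
    return 0.5 * std::erfc(-(x - mean) / (sd * std::sqrt(2.0)));
}

// Smooth multiplier from the plot: pnorm(score/1000000, mean=0.65, sd=0.15).
double smoothMultiplier(double score) {
    return normalCdf(score / 1000000.0, 0.65, 0.15);
}
```

Note that with these parameters the curve passes through 0.5 at a score of 650000 and only reaches about 0.99 at 1000000, so it would need rescaling if a perfect score has to give exactly the full strain value.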
Ciel
I guess when I said I had no time, I lied. Well, I decided to ditch my homework instead, oops.

With help from Shoegazer, we tried coming up with an alternative way to compute strain instead. This follows a more exponential curve, instead of either a logistic or inverse logistic curve. This solves the issue of overweighting score ranges, especially past 700k, while valuing passes on difficult songs slightly more. However, this also leads to a general drop in PP gains overall.

For those that don't really care about the math, this is a sheet containing recomputed rankings for the top 500 players, as well as some basic examples as to how pp scales over the range of scores.

More Technical Details
Currently, I have kept the Accuracy value unchanged and changed only the strain multiplier. In place of the broken piecewise formula listed above, the multiplier now follows an exponential curve.

(E^(score*exp/1000000)/(E^exp)-E^(-exp))/(1-E^(-exp))

This is the general formula for an exponential-like curve, which is scaled such that a score of 0 gives a multiplier of x0.00, and a score of 1m gives a multiplier of x1.00. Currently, I have tried setting exp to 1.2, as it does not necessarily reward farming for accuracy, and also rewards simply passing fairly difficult maps.
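
As a rough C++ sketch of the formula above, with k standing in for the exp parameter (the function names are illustrative). Multiplying the numerator and denominator by e^k gives the shorter equivalent form (e^(k*s) - 1) / (e^k - 1) with s = score/1000000, which is included as a cross-check.

```cpp
#include <cmath>

// Exponential-like multiplier, written exactly as in the formula above:
//   (e^(score*k/1000000) / e^k - e^(-k)) / (1 - e^(-k))
double expMultiplier(double score, double k) {
    double s = score / 1000000.0;
    return (std::exp(s * k) / std::exp(k) - std::exp(-k)) / (1.0 - std::exp(-k));
}

// Algebraically equivalent form: (e^(k*s) - 1) / (e^k - 1).
double expMultiplierSimplified(double score, double k) {
    double s = score / 1000000.0;
    return (std::exp(k * s) - 1.0) / (std::exp(k) - 1.0);
}
```

With k = 1.2, a score of 500000 gives a multiplier of roughly 0.35, compared with 0.10 under the current piecewise formula, while the multiplier at 0 and 1000000 stays at 0 and 1.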
Full Tablet
The problem with using the same Strain Multiplier curve for every map is that not all maps have the same distribution of difficulty in their notes.

Example:
Map A: 3000 notes. About 95% of the map is very easy, but about 5% of the map are huge difficulty spikes. Star rating is 5.5
Map B: 3000 notes. The whole map is hard, but no part of it is as hard as the difficulty spikes of Map A. Star rating is also 5.5

If both give the same pp for the same score, then lower scores in Map A would be overrated, while high scores would be underrated, compared to the same scores in Map B.

The solution is to stop using a single value to rate the difficulty of a map and instead use a Difficulty Curve (a function of score) that describes how hard it is to get a specific score on a map, then calculate the pp of a play from the difficulty value at the score achieved.
Ciel

Full Tablet wrote:

stuff
While what you state is true, this is intended to be a band-aid fix for the current system, which unfortunately provides only a star rating that can be used in difficulty calculations.

In addition, while it would be nice to have a difficulty curve function for the entire score range, I don't think that is entirely possible due to the fact that it would bloat the database which is being used to compute these things. However, don't quote me on this.