osu!mania ScoreV2 live!

Redon
FI is a stupid idea and needs to be removed completely
FL and HD need to simply not influence score or pp at all because they are purely a question of player preference.
HD should be changed into a customizable lane cover that can either be static or grow in either direction, replacing both HD and FI.
There, I solved it all for you.
Full Tablet
The previous formulas I proposed (based on fitting the timings of the play to a normal distribution) become simpler, more accurate, and faster to calculate if one uses the exact timing of each hit instead of the judgment counts.

How would the scoring system work:
1) Take the exact error of each note that was hit. For LN releases, divide the error by 1.5 (to account for the fact that releases are harder to time than hits); the divisor could be adjusted to another value. Do not consider hits/releases that landed in their "Miss" timing window.

Take the sum of the squares of those errors (that value will be referred to as "s"), and the count of the hits and releases that weren't misses (referred to as "k").

2) Count the number of missed notes and releases (referred to as "m").

3) With that information, calculate the Normal Distribution with zero mean that fits the data best, obtaining its standard deviation Sigma (details of the calculation below).

4) Use a scaling function that maps that standard deviation to a score. A good choice for this function is, for Sigma measured in ms:

Erf is the Error Function. If Sigma is zero (which is only possible with a perfect play, and so should be almost certainly impossible), then the score is 1 million.

For EZ/HR, for balance, it would be best if they didn't change the timing window for 50s or the timing window for Misses. That way they can't affect the score and become merely cosmetic (changing only the distribution of the judgments during the play). DT/HT shouldn't change the timing windows of 50s and Misses either (besides the scaling that makes internal clocks match real time, as is done currently). To make scores with different ODs on the same map directly comparable, those timing windows shouldn't change with OD either (this way, OD becomes merely cosmetic as well while playing).

The variables are:
s = Sum of the squares of the timing errors of hits/releases that weren't a Miss. (Releases with errors scaled by a constant)
k = Count of hits/releases that weren't a Miss.
m = Count of Misses.
T = Upper limit of the timing window for a 50, in ms.
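
Steps 1 and 2 can be sketched as a single pass over the judged events. The input shape (a signed error in ms, or None for a Miss, plus a release flag) and the name `summarize_errors` are assumptions of this sketch; the 1.5 divisor is the value suggested above and could be tuned.

```python
from typing import Iterable, Optional, Tuple

RELEASE_DIVISOR = 1.5  # value suggested in the post; adjustable


def summarize_errors(
    events: Iterable[Tuple[Optional[float], bool]]
) -> Tuple[float, int, int]:
    """Return (s, k, m) for a play.

    Each event is (error_ms, is_release). error_ms is None when the
    hit or release was judged a Miss (hypothetical input format).
    """
    s = 0.0  # sum of squared (scaled) timing errors
    k = 0    # hits/releases that were not a Miss
    m = 0    # Misses
    for error_ms, is_release in events:
        if error_ms is None:
            m += 1
            continue
        if is_release:
            error_ms = error_ms / RELEASE_DIVISOR
        s += error_ms * error_ms
        k += 1
    return s, k, m


play = [(10.0, False), (-20.0, False), (30.0, True), (None, False)]
s, k, m = summarize_errors(play)
```

Here the 30 ms release counts as a 20 ms error, so the three non-miss events contribute 100 + 400 + 400 to s.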

Case with no Misses
Sigma is simply
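
Assuming "fits the data the best" means a maximum-likelihood fit (the fitting criterion is an assumption here, the post does not name one), the no-miss case reduces to a short derivation in terms of s and k:

```latex
\begin{align*}
\ln L(\sigma) &= -k \ln \sigma - \frac{s}{2\sigma^{2}} + \text{const} \\
\frac{d \ln L}{d\sigma} &= -\frac{k}{\sigma} + \frac{s}{\sigma^{3}} = 0
\quad\Longrightarrow\quad
\sigma = \sqrt{\frac{s}{k}}
\end{align*}
```

For example, s = 400 ms^2 over k = 4 hits would give Sigma = 10 ms.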

Case with Misses
Sigma is the positive number that solves the equation:

The equation doesn't have a simple closed-form solution, but it can be easily solved numerically with Newton's Method (since the function to find a root for is convex and monotonic).
A first approximation of Sigma (that always is smaller than the real solution) is:


With BS = T/Sigma.
Function to find the root for:
Its derivative:

Note: Because of numerical errors when calculating the functions with double-precision floating-point numbers, for high values of BS it's more accurate to use a series expansion near Sigma=0 instead of attempting to calculate their values with the exact formulas.
If BS > 7, then:
If BS > 20, then:

Then, starting with the initial approximation for Sigma, iterate with Newton's Method until F[Sigma] is small:
Sigma[n+1] = Sigma[n] - F[Sigma[n]] / F'[Sigma[n]]

Testing this algorithm, it takes about 0.8ms on average to find an accurate value for Sigma (with an absolute error of 10^(-7)).
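
A minimal sketch of such a solver, under stated assumptions rather than the exact formulas of this post: each Miss is treated as an observation that fell outside ±T under the same zero-mean normal, the likelihood is maximized, and the resulting equation G(Sigma) = 0 is solved with Newton steps that fall back to bisection whenever a step leaves the bracket. The names `mills_factor`, `residual` and `fit_sigma`, the form of G, the starting point and the x > 25 cutoff are all choices made for this sketch.

```python
import math


def mills_factor(x: float) -> float:
    """q(x) = exp(-x^2) / (x * sqrt(pi) * erfc(x)); q > 1 for x > 0."""
    if x > 25.0:
        # erfc underflows for large x, so use its asymptotic series instead.
        y = 1.0 / (2.0 * x * x)
        return 1.0 / (1.0 - y + 3.0 * y * y - 15.0 * y ** 3)
    return math.exp(-x * x) / (x * math.sqrt(math.pi) * math.erfc(x))


def residual(sigma, s, k, m, T):
    """Return (G, dG/dsigma), G = k*sigma^2 - s - m*T^2*q(T/(sigma*sqrt(2)))."""
    x = T / (sigma * math.sqrt(2.0))
    q = mills_factor(x)
    g = k * sigma * sigma - s - m * T * T * q
    dg = 2.0 * k * sigma + (m * T * T / sigma) * q * (2.0 * x * x * (q - 1.0) - 1.0)
    return g, dg


def fit_sigma(s, k, m, T, tol=1e-7, max_iter=200):
    """Maximum-likelihood Sigma (ms) from s, k, m and the 50 window T."""
    if k <= 0:
        raise ValueError("need at least one hit that was not a Miss")
    if m == 0:
        return math.sqrt(s / k)
    lo = math.sqrt((s + m * T * T) / k)  # G(lo) < 0 because q > 1
    hi = 2.0 * lo
    while residual(hi, s, k, m, T)[0] < 0.0:
        hi *= 2.0
    sigma = lo
    for _ in range(max_iter):
        g, dg = residual(sigma, s, k, m, T)
        if g < 0.0:
            lo = sigma
        else:
            hi = sigma
        nxt = sigma - g / dg if dg > 0.0 else lo  # Newton step when usable
        if not (lo < nxt < hi):
            nxt = 0.5 * (lo + hi)  # bisection fallback
        if abs(nxt - sigma) < tol:
            return nxt
        sigma = nxt
    return sigma


sigma_with_misses = fit_sigma(40000.0, 400, 5, 150.0)
```

With s = 40000, k = 400 and T = 150, the no-miss estimate would be 10 ms; adding 5 Misses pushes the fitted Sigma well above that, since every Miss is evidence of an error larger than T.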
Kempie
That looks pretty impressive, I'll have to read into it later though (currently at work).

As good as it might be, I do think you're wasting your time. Quoting from Smoogipoo's askfm:

Anonymous questioner wrote:

Will you keep trying with different versions of scorev2 for mania? The current iteration is more a compromise between what players want, and what the game developers think it should be, instead of what players really want with the score system.
Answered ~4 months ago.

smoogi~ wrote:

Yeah. I've just been busy the past week with exams and will continue to be busy in the coming week with the same...
But let me address something. Combo _will_ remain regardless of iteration. It is not going anywhere. So if "what players really want" is for scoring to be an accuracy-only model, forget it.
Likewise, if players want us to copy another game's scoring system, forget it. If players want us to make a massively complex scoring system that takes into account difficult in sections of the maps and/or requires careful analysis of the timing distributions of hits, forget it.
But why, why will combo remain? You HAVE to realize that ScoreV2 is going to be used for MWC, and there are other aspects to consider in such an environment to make gameplay more exciting and to really show off the best-of-the-best. I've explained this before on reddit/the forums.
The scoring system must be easily able to be changed/recomputed and must be easy to use for _all_ other modes with minimal to no modification. Yes, this is "what the developers want", because we want to be able to re-balance the meta easily in the future.
http://ask.fm/smoogipooo/answers/138940251287

So basically; calm down and suck up to scorevcombo
ReTLoM
Any news cause 7k MWC isnt far anymore
Kempie

ReTLoM wrote:

Any news cause 7k MWC isnt far anymore
I'd like to know as well.
Halogen-
So, gonna start this train now: adjust LN life weighting so that heads/tails are 0.5x of normal, as a miss of a head almost certainly means you're missing a tail and you shouldn't be penalized twice as harshly for missing an LN as you would be for a normal tap, haha

this incorrect life weighting caused a *lot* of fails in 4K MWC because LNs were mathematically twice as harsh in penalty.
Shoegazer

Shoegazer wrote:

Rainbow Accuracy

Shoegazer wrote:

320s are very much underweighted because the only component of the scoring system that takes into account 320 accuracy is the combo component, which only has a 20% prominence. Add on to the fact that the difference between a 300 and 320 is so small and that the absolute difference between juan and Hudo's 320 count isn't that significant, it would make sense that 320s are really underweighted at the moment.

You could mitigate this by including 300gs into accuracy, but from what I've experimented it might create too much emphasis on MAX accuracy with charts that players have issues getting 96%+ on (and as a result would not be an accurate assessment of skill).

Alternatively, you can avoid including MAXes in the accuracy component and just increase the importance of MAXes to like 360 to increase the emphasis of it by a noticeable but not overpowering amount in the combo component, but that requires a bit more experimentation.
I initially wanted to increase the rainbow judgement weightage without embedding rainbows into accuracy, but no matter how much I changed it, the difference is very minor (~600-1,200 points) and a 200 will almost always be too powerful compared to a rainbow 300. So I scrapped that idea and thought that embedding rainbows into accuracy with a reasonable weightage and maybe making the curve more lenient would be the best idea.

I've been experimenting with weightages and discussing with people about how much a 200 should be worth compared to a 300. I initially thought that 310 would be fine (and a 200 would be worth 11 300s), but when it came to matches like this, if accuracy was the only factor, Argentina would win by 21,000 points. I do think that Argentina should win and it's a step in the right direction, but 21,000 seems extremely overwhelming since it undermines the fact that Poland had, overall, noticeably fewer 200s. I tried it with harder charts too and they seem to favour rainbow accuracy a little too much for my liking - especially since when it comes to harder charts (which players struggle with), good rainbow accuracy is usually caused by variance rather than a higher skill level. 200s and worse judgements should determine performance for that.

I wanted to use 307 afterwards, but it still gave a bit too much emphasis for my liking, about 12,500 points for that Argentina/Poland match. I went down to 305, and the difference is about 6,800. I think that's ultimately the most reasonable assessment, and others I've talked to seem to agree with the notion that a 200 is about 21 normal 300s. Ignoring the bad judgements (since those values are pretty much set in stone at this point), this is probably (part of) the ideal solution. This does mean that only full rainbow scores are SSs, but I don't see that as a problem as frames of reference can be shifted.

Getting rid of the difference between a rainbow and a normal 300 in the combo scoring component is probably ideal too, since that should be in the accuracy component, not the combo component. If rainbows are included into accuracy, the combo component does not need a rainbow component.

I also wanted to soften the exponential curve a tiny bit when it comes to including rainbows, mainly because at a certain point extremely good accuracy is more caused by variance rather than a very high skill level - unless the performance is consistently done, which is not measurable with just one match and one attempt. The exponential I had in mind was Accuracy^(2 + 2 * Accuracy), but it's essentially Accuracy^4 - so 1 power down.

tl;dr: Embed rainbows into accuracy with a weightage of 305 instead of 320, change the accuracy curve to Accuracy^(2 + Accuracy * 2), remove the differentiation between rainbows and normal 300s in the combo component (both of them should have a HitValue of 30).
Reposting this as well; considering that rainbow accuracy is a major component in assessing skill, I think embedding higher emphasis in rainbow accuracy will make the scorev2 system more accurate when it comes to assessing skill.
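
The arithmetic behind the "11" and "21" figures, and the proposed curve, can be sketched as follows. The 100/50/Miss weights are the usual osu!mania judgement values and are an assumption here (the post only discusses the rainbow, 300 and 200 weights), as are the function names.

```python
def weighted_accuracy(counts, max_weight=305):
    """Accuracy in [0, 1] when a rainbow (MAX) is worth max_weight."""
    # 100/50/miss weights are assumed standard values, not from the post.
    weights = {"max": max_weight, "300": 300, "200": 200,
               "100": 100, "50": 50, "miss": 0}
    total = sum(counts.values())
    earned = sum(weights[judgement] * n for judgement, n in counts.items())
    return earned / (max_weight * total)


def accuracy_curve(acc):
    """Proposed curve Accuracy^(2 + 2 * Accuracy); the exponent is 4 at 100%."""
    return acc ** (2 + 2 * acc)


def plain_300s_per_200(max_weight):
    """How many plain 300s lose as much accuracy as a single 200."""
    return (max_weight - 200) / (max_weight - 300)


ratio_310 = plain_300s_per_200(310)  # 110 / 10
ratio_305 = plain_300s_per_200(305)  # 105 / 5
```

With a rainbow worth 310, a plain 300 drops 10 points and a 200 drops 110, so one 200 costs as much as 11 plain 300s; at 305 the same ratio is 105 / 5 = 21.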
Remyria
I don't think making SS max scores only is a good idea, with how hard it is. The accuracy needed for hitting only 300g is at a completely different level than in any other mode. I genuinely hope it's not planned for osu!lazer and is just a tournament thing
Topic Starter
smoogipoo
Who cares, this is only for tournament/MWC for now. Ideally come osu!next if this is the best path forward, osu!mania should have an SSS ranking.
Kamikaze

smoogipooo wrote:

Who cares, this is only for tournament/MWC for now. Ideally come osu!next if this is the best path forward, osu!mania should have an SSS ranking.
That would be amazing actually.
Remyria

smoogipooo wrote:

Who cares, this is only for tournament/MWC for now. Ideally come osu!next if this is the best path forward, osu!mania should have an SSS ranking.
if there's such thing as an SSS or SS+, I can already see the achievement's description "Beyond perfection."
PsychicScribble
Well, I'm gonna be screwed as soon as this launches.
adamdino123
;-; [*] R.I.P
Superluminal

Redon wrote:

FI is a stupid idea and needs to be removed completely
FL and HD need to simply not influence score or pp at all because they are purely a question of player preference.
HD should be changed into a customizable lane cover that can either be static or grow in either direction, replacing both HD and FI.
There, I solved it all for you.
Wouldn't it be something if they got rid of FL and kept HD and FI (Which should be called sudden) but put FI where FL was such that if you had both enabled it would function as FL
LastExceed
Now that a non-shiny 300 doesn't give 100% acc anymore, does that mean an SS in scoreV2 is as rare as a million in scoreV1 ? that imagination really doesn't feel right...

Tachyon wrote:

Wouldn't it be something if they got rid of FL and kept HD and FI (Which should be called sudden) but put FI where FL was such that if you had both on it would function as FL
FI + HD =/= FL
The big differences are the fact that FL doesn't scale and that it covers the entire screen while the FI/HD shadow only covers the stage.

Redon wrote:

FI is a stupid idea and needs to be removed completely
FL and HD need to simply not influence score or pp at all because they are purely a question of player preference.
HD should be changed into a customizable lane cover that can either be static or grow in either direction, replacing both HD and FI.
There, I solved it all for you.
being the FI guy I feel like its my duty to say this: "DUN DELET FAD-EN!1!!11one!"
no srsly FI can be really fun, there's no reason to remove it.
Yas
Found a small issue with SV2. When you initially play a file, you are given a different accuracy percent than you get after reloading osu.
Perhaps osu calculates sv2 mod scores with sv1 after a reload of osu.

This was the screenshot I took right after playing a file.

This was a screenshot of the same play, same score, but the accuracy is markedly higher.
Cuber

LastExceed wrote:

Now that a non-shiny 300 doesn't give 100% acc anymore, does that mean an SS in scoreV2 is as rare as a million in scoreV1 ? that imagination really doesn't feel right...
The only reason it doesn't feel right is because you are used to the current system. Having a judgement below the highest one negatively impact accuracy makes way more sense if you get out of the old frame of mind.

edit: confusing terminology
LastExceed

Cuber wrote:

The only reason it doesn't feel right is because you are used to the current system. Having a judgement below the highest one negatively impact accuracy makes way more sense if you get out of the old frame of mind.
edit: confusing terminology
I don't think that's the issue here. I'm completely fine with all scores dropping a little and S ranks becoming harder when scoreV2 goes live, because I know that relatively it stays the same (I just need to get used to the new standards), but making a whole rank a once-in-a-lifetime experience is like turning it into an achievement. Imagine what the user profiles would look like: most people would have 0 SS ranks. Furthermore, mania is the mode with the easiest S ranks in osu!. With scoreV2 as it is, it would get the hardest SS ranks, which is quite a contrast.
Full Tablet
It's a good thing that non-shiny 300s do not give 100% accuracy. When they give 100%, the acc% value becomes an imprecise measure of accuracy at high accuracy levels (for example, there is a big difference between an SS with a 1:3 300:300g ratio and an SS with a 1:10 300:300g ratio).

A better solution for the problem of SSs being too rare is changing the requirements for an SS.
LastExceed

Full Tablet wrote:

It's a good thing that non-shiny 300s do not give 100% accuracy. When they give 100%, the acc% value becomes an imprecise measure of accuracy at high accuracy levels (for example, there is a big difference between an SS with a 1:3 300:300g ratio and an SS with a 1:10 300:300g ratio).

A better solution for the problem of SSs being too rare is changing the requirements for an SS.
That's true. Time to bring the SSS rank here