osu!mania ScoreV2 live!

Kamikaze
This is actually not late at all, we should be slowly starting to get back into discussions about v2 because 7K MWC is right after OWC which is about to start.
LastExceed
Just seems like no one cares anymore...
juankristal

-Kamikaze- wrote:

This is actually not late at all, we should be slowly starting to get back into discussions about v2 because 7K MWC is right after OWC which is about to start.
Sure thing, it is still a long way to go but same thing we said last time.

The point is that we don't really need much more to talk about in order to move forward. We saw the results in the 4K MWC, and what we can do now is post solutions to the problems (even though this has already been done before, and we all know what happened). Hopefully those points will be used for this 7K World Cup.
LastExceed
still need an answer to my question though
why does FlashLight give a multiplier but FadeIn doesn't when there are some people who perform better with FL than NoMod but not a single one in the world who performs better with FI than NoMod? (I can tell by the "global rankings with active mods" that I am the only one in the world who uses FI for topscores, and even I only do it because it's a fun challenge, it's actually a handicap to my performance)
Kamikaze
you are automatically assuming that no one can do FI just as well as or better than nomod, so that's already a dumb point. It's all a matter of preference, for example Tidek can do even better on FI than on nomod on some maps (he A'd blastix riotz with FI for example, though not a pb)
LastExceed

-Kamikaze- wrote:

you are automatically assuming that no one can do FI just as well as or better than nomod, so that's already a dumb point. It's all a matter of preference, for example Tidek can do even better on FI than on nomod on some maps (he A'd blastix riotz with FI for example, though not a pb)
I can hardly believe that it increases his performance and he just doesn't want to use it.
It's probably only because he can't reach a high combo on that map (as you said it's "just" an A) so the shadow didn't go very low
I'm gonna ask him that personally to clear this up. Any other potential FI players you know of? I am always searching

But even if there are a few players who can play better with FI and they just don't want to, the amount is still way smaller than FL players so the point persists.
Kamikaze
that would not be the case fyi if FI worked as it actually should work. at the moment the cover goes way too low, making it reaaaaaaaaally hard to read on default hitposition and that's pretty much the only reason why a lot of people tend to make skinned lanecovers instead. a lot of people in BMS, IIDX and even here use a lanecover that covers the screen in the same way as FI, but without the effect of it "growing" and it does help them a lot. some examples:



LastExceed
yeah I am aware of people using lane covers. I always thought that the difficulty of FI lies in its crazy scaling, and I wish that I could get some sort of reward for handicapping myself with it one day, especially now that combo is getting some value in mania...
Kempie

-Kamikaze- wrote:

.... a lot of people in BMS, IIDX and even here use a lanecover that covers the screen in the same way as FI, but without the effect of it "growing" and it does help them a lot....
The growing effect is exactly what makes FI a whole lot harder. Getting good with FI means:
  1. You have to get used to the ridiculously shallow visibility maxed out FI provides.
  2. You have to deal with misses fucking up FI's shadow.


There's a good reason why plenty of people skin in a lane cover, but nobody seriously uses FI.
Kamikaze
I know, and this is kinda what I mean: FI's design is flawed, whether it's "growing" too far, "growing" from too high a starting point, or "growing" at all.

I have heard before that with v2 optimizations for next MWC there can be changes in mods, so I would suggest making both hidden and fadein either grow but from much higher point and not as far up/down or just be stable at like 2/3 up/down of the way from your hitposition to the edge of the screen and (this has to be done to make FI even semi viable) fixate cover of FI/HD to hitposition.

This is the main reason why FI is so hard for most: you can deal with the growing in various ways (like even putting a shirt on the monitor), but it grows too far down to be able to handle dense patterns, since you have to drastically lower your speedmod.

Talking about mods and score multipliers - after watching the last MWC I do think that HR and FL/FI/HD should all have the same multiplier, and that it should be 1.06x, since up until finals (and even at that stage there were maps where this applied) using HR on freemods was nearly always a net positive. It was actually quite hard to get a lower score using HR than you would using nomod, because the score buffer on HR was more than enough to cover for lower acc. HR should be a risky mod for those who are REALLY confident they can do well on a map. That multiplier also made visual mods (FL there, but I'm talking about all of them for the sake of the future) unviable unless you were doing HR already, due to their lower multiplier. There should be multiple tactical avenues you can take, and FL vs HR, for example, should be a good one.
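To make the "score buffer" point concrete, here is a minimal sketch of the break-even arithmetic. It assumes the mod multiplier is applied as a plain factor on the final score, which the thread does not spell out, and the function names are made up for illustration.

```python
def break_even_ratio(multiplier):
    """Fraction of the nomod score that a modded play must reach before
    the multiplier is applied in order to tie the nomod play."""
    return 1.0 / multiplier


def mod_pays_off(modded_base, nomod_score, multiplier):
    """True if the modded play outscores nomod once the multiplier is applied."""
    return modded_base * multiplier > nomod_score


# Share of the pre-multiplier score a 1.06x play can give up and still tie.
print(round(1.0 - break_even_ratio(1.06), 4))
# Hypothetical scores: a modded play slightly below nomod before the multiplier.
print(mod_pays_off(950_000, 1_000_000, 1.06))
print(mod_pays_off(940_000, 1_000_000, 1.06))
```

Under that assumption, a 1.06x mod is worth picking whenever it costs the player less than roughly 5.7% of their pre-multiplier score.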

I am aware that there are players that can do better with FL/HD/whatever than nomod, but that kinda eliminates them from the nomod bracket, so it's another wrinkle in the player picking strategy for the captains, which is nice.

Also there is a biiiig problem, as explained before, with LN drain being too high (both start and end drain the same amount of hp as normal notes, which is a killer; imo it should be about 70% for the start and 30% for the release), and with the MAX:300 ratio being underrated as fuck:

Kempie wrote:

Since there's not much left to discuss about ScoreV2, I'll just post this interesting MWC match to remind the devs on the severity of MAX's being underrated:



Other scores left out for brevity.
Ciel's post explains some of the issues as well: link for convenience sake.
LastExceed
-Kamikaze- wrote:

I have heard before that with v2 optimizations for next MWC there can be changes in mods
This would be some good news.

-Kamikaze- wrote:

I would suggest making both hidden and fadein either grow but from much higher point and not as far up/down or just be stable at like 2/3 up/down of the way from your hitposition to the edge of the screen
Making them smaller would result in even more people using them out of preference, and a fixed shadow would be boring (I find the scaling very fun, so we should keep that). You're right, something should change, but not like this.

-Kamikaze- wrote:

fixate cover of FI/HD to hitposition
Yeah, this makes sense.

-Kamikaze- wrote:

you have to drastically lower your speedmod.
You're overlooking something important here: yes, you do have to slow down, but only as much as people like me (who play with low speed by default) have to speed up in order to play hidden. To me HD is as hard as FI is to you. This is also why I think the maximum scale doesn't necessarily have to be changed that much, if at all.

-Kamikaze- wrote:

Talking about mods and score multipliers (...)
All I can say here is that the HR multiplier is too high, because even I (who has horrible acc) get better scores with it. Balancing HR and visual mods would only be useful for tournaments, and I am not sure if that's really worth it.
I have no feel for score or hp values because I have too little experience with v2, so I can't give an opinion on that.

Kamikaze is a nice guy. Although our opinions almost ALWAYS differ, it's nice to have him in debates like these because he brings the discussion forward a lot by always staying on topic and providing good arguments for everything. A rare ability I really appreciate.
Kamikaze

LastExceed wrote:

I would suggest making both hidden and fadein either grow but from much higher point and not as far up/down or just be stable at like 2/3 up/down of the way from your hitposition to the edge of the screen
making them smaller would result in even more people using them out of preference, and a fixed shadow would be boring (I find the scaling very fun, so we should keep that). You're right, something should change, but not like this.
Bolded the sentence on purpose - the fact that you find scaling fun does not mean that it should be the way it's done. It's just personal bias. Also the way I see it - the initial cover should be larger for both FI/HD so it's not that much of a mindfuck

you have to drastically lower your speedmod.
you're overlooking something important here: yes, you do have to slow down, but only as much as people like me (who play with low speed by default) have to speed up in order to play hidden. To me HD is as hard as FI is to you. This is also why I think the maximum scale doesn't necessarily have to be changed that much, if at all.
That is actually not true. I play at a relatively average speedmod (25-26) and I can read hidden fine at that speed (or 1 higher), while for FI on the default hitposition I have to use speed 9. Tidek, as another example, usually plays at a speed around 22-23 (iirc), and he also took his down to 11 with a lower hitposition, whereas he learned hidden before it was removed from the mod pool for the last MWC and I'm pretty sure he didn't raise his scroll speed much, if at all. You also missed the point that you have to drastically decrease it and that the field of vision is too low with the default hitposition, and I believe that if the cover area is fixated not to the edge of the screen but to the hitposition, the field of vision will already be a decent bit bigger.

Talking about mods and score multipliers (...)
All I can say here is that the HR multiplier is too high, because even I (who has horrible acc) get better scores with it. Balancing HR and visual mods would only be useful for tournaments, and I am not sure if that's really worth it.
I have no feel for score or hp values because I have too little experience with v2, so I can't give an opinion on that.
Even if it's only for tournament purposes it's still hella worth it. You would rather see a fair tournament with interesting scoring and mod mechanics than one with broken ones, wouldn't you?

Kamikaze is a nice guy. Although our opinions almost ALWAYS differ, it's nice to have him in debates like these because he brings the discussion forward a lot by always staying on topic and providing good arguments for everything. A rare ability I really appreciate.
oh, thanks haha I appreciate that
Redon
FI is a stupid idea and needs to be removed completely
FL and HD need to simply not influence score or pp at all because they are purely a question of player preference.
HD should be changed into a customizable lane cover that can either be static or grow in either direction, replacing both HD and FI.
There, I solved it all for you.
Full Tablet
The previous formulas I proposed (based on fitting the timings of the play to a normal distribution) become simpler, more accurate, and faster to calculate if one uses the exact timing of each hit instead of the judgment counts.

How would the scoring system work:
1) Take the exact error of each note that was hit. For LN releases, divide the error by 1.5 (to account for the fact that it's harder to time releases than hits); this divisor could be adjusted to another value. Do not consider hits/releases that fell in their "Miss" timing window.

Take the sum of the squares of those errors (that value will be referred as "s"), and the count of the hits and releases that weren't misses (referred as "k").

2) Count the number of missed notes and releases (referred as "m").

3) With that information, calculate the Normal Distribution with zero mean that fits the data the best, obtaining its standard deviation Sigma (details of the calculation below).

4) Use a scaling function that maps that standard deviation to a score. A good choice for this function is, for Sigma measured in ms:

Erf is the Error Function. In the case Sigma is zero (which is only possible with a perfect play, which should be almost certainly impossible), then the score is 1 million.

For EZ/HT, for balance, it would be best if they don't change the timing window for 50s nor the timing window for Misses. That way, they can't have an effect on score, they become merely cosmetic (changing the distribution of the judgments during the play). DT/HT shouldn't change the timing window of 50s and Misses either (besides scaling to make internal clocks match real time, like it is done currently). To make scores with different ODs in the same map be directly comparable, those timing windows shouldn't change either (this way, OD becomes merely cosmetic as well while playing).

The variables are:
s = Sum of the squares of the timing errors of hits/releases that weren't a Miss. (Releases with errors scaled by a constant)
k = Count of hits/releases that weren't a Miss.
m = Count of Misses.
T = Upper limit of the timing window for a 50, in ms.

Case with no Misses
Sigma is simply the root mean square of the errors:
Sigma = Sqrt[s / k]

Case with Misses
Sigma is the positive number that solves the equation:

The equation doesn't have a simple closed form solution, but it can be easily solved numerically with Newton's Method (since the function to find a root for is convex and monotonic).
A first approximation of Sigma (that is always smaller than the real solution) is:


With BS = T/Sigma.
Function to find the root for:
Its derivative:

Note: Because of numerical errors while using double floating-point numbers to calculate the functions, for high values of BS, it's more accurate to use a series expansion near Sigma=0, instead of attempting to calculate their values with the exact formulas.
If BS > 7, then:
If BS >20, then:

Then, starting with the initial approximation for Sigma, iterate with Newton's Method until F[Sigma] is small:
Sigma[n+1] = Sigma[n] - F[Sigma[n]] / F'[Sigma[n]]

Testing this algorithm, it takes about 0.8ms on average to find an accurate value for Sigma (with an absolute error of 10^(-7)).
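A minimal sketch of steps 1 to 3, for readers who want to experiment. Since the formulas themselves are not reproduced here, the root function below is an assumption: it is what a maximum-likelihood fit gives if every Miss is treated as a hit whose error lies somewhere outside ±T. The lower bound used as a starting point, the names `fit_sigma`, `residual` and `inverse_mills`, and the bisection safeguard around Newton's Method are likewise illustrative choices, not necessarily the ones used in the post. The score mapping from step 4 is left out.

```python
import math

RELEASE_DIVISOR = 1.5  # from the post: release errors are divided by 1.5


def inverse_mills(b):
    """Standard normal pdf(b) divided by the upper tail probability, b > 0."""
    if b > 20.0:
        # Asymptotic series: exp() and erfc() both underflow for large b.
        return b + 1.0 / b - 2.0 / b**3 + 10.0 / b**5
    return (math.sqrt(2.0 / math.pi) * math.exp(-0.5 * b * b)
            / math.erfc(b / math.sqrt(2.0)))


def residual(sigma, s, k, m, T):
    """sigma**3 times the derivative of the assumed log-likelihood."""
    return s - k * sigma**2 + m * sigma * T * inverse_mills(T / sigma)


def residual_prime(sigma, k, m, T):
    """Derivative of residual() with respect to sigma."""
    b = T / sigma
    lam = inverse_mills(b)
    return -2.0 * k * sigma + m * T * lam * (1.0 - b * (lam - b))


def fit_sigma(hit_errors, release_errors, m, T, tol=1e-7):
    """Fit Sigma (ms) from non-miss timing errors, m Misses and window T."""
    errors = list(hit_errors) + [e / RELEASE_DIVISOR for e in release_errors]
    k = len(errors)
    if k == 0:
        raise ValueError("at least one non-miss hit is required")
    s = sum(e * e for e in errors)
    if m == 0:
        return math.sqrt(s / k)  # plain maximum-likelihood estimate

    # Lower bound: residual() is positive here because inverse_mills(b) > b.
    lo = math.sqrt((s + m * T * T) / k)
    hi = 2.0 * lo
    while residual(hi, s, k, m, T) > 0.0:
        hi *= 2.0

    sigma = lo
    for _ in range(100):
        f = residual(sigma, s, k, m, T)
        if f > 0.0:
            lo = sigma
        else:
            hi = sigma
        fp = residual_prime(sigma, k, m, T)
        nxt = sigma - f / fp if fp != 0.0 else 0.5 * (lo + hi)
        if not lo <= nxt <= hi:
            nxt = 0.5 * (lo + hi)  # Newton left the bracket: bisect instead
        if abs(nxt - sigma) < tol:
            return nxt
        sigma = nxt
    return sigma
```

For example, `fit_sigma([10.0] * 100, [], 5, 150.0)` fits 100 hits that are each 10 ms off plus 5 Misses with a 150 ms window; the Misses pull the fitted Sigma well above the 10 ms that the hits alone would give.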
Kempie
That looks pretty impressive, I'll have to read into it later though (currently at work).

As good as it might be, I do think you're wasting your time. Quoting from Smoogipoo's askfm:

Anonymous questioner wrote:

Will you keep trying with different versions of scorev2 for mania? The current iteration is more a compromise between what players want, and what the game developers think it should be, instead of what players really want with the score system.
Answered ~4 months ago.

smoogi~ wrote:

Yeah. I've just been busy the past week with exams and will continue to be busy in the coming week with the same...
But let me address something. Combo _will_ remain regardless of iteration. It is not going anywhere. So if "what players really want" is for scoring to be an accuracy-only model, forget it.
Likewise, if players want us to copy another game's scoring system, forget it. If players want us to make a massively complex scoring system that takes into account difficult in sections of the maps and/or requires careful analysis of the timing distributions of hits, forget it.
But why, why will combo remain? You HAVE to realize that ScoreV2 is going to be used for MWC, and there are other aspects to consider in such an environment to make gameplay more exciting and to really show off the best-of-the-best. I've explained this before on reddit/the forums.
The scoring system must be easily able to be changed/recomputed and must be easy to use for _all_ other modes with minimal to no modification. Yes, this is "what the developers want", because we want to be able to re-balance the meta easily in the future.
http://ask.fm/smoogipooo/answers/138940251287

So basically; calm down and suck up to scorevcombo
ReTLoM
Any news cause 7k MWC isnt far anymore
Kempie

ReTLoM wrote:

Any news cause 7k MWC isnt far anymore
I'd like to know as well.
Halogen-
So, gonna start this train now: adjust LN life weighting so that heads/tails are 0.5x of normal, as a miss of a head almost certainly means you're missing a tail and you shouldn't be penalized twice as harshly for missing a LN than you would a normal tap, haha

this incorrect life weighting caused a *lot* of fails in 4K MWC because LN's were mathematically twice as harsh in penalty.
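A quick sketch of the arithmetic behind this, assuming HP loss simply scales with a per-judgement weight. The base penalty of 1.0 is an arbitrary unit and the function name is hypothetical, not the game's actual drain code.

```python
def ln_miss_penalty(base_penalty, head_weight, tail_weight):
    """Total HP lost when both the head and the tail of an LN are missed."""
    return base_penalty * (head_weight + tail_weight)


BASE = 1.0  # HP lost for missing one normal note, in arbitrary units

# Head and tail each drain like a normal note (the behaviour criticised here).
current = ln_miss_penalty(BASE, 1.0, 1.0)
# Heads and tails weighted at 0.5x of a normal note.
halved = ln_miss_penalty(BASE, 0.5, 0.5)
# The 70% / 30% split suggested earlier in the thread.
split = ln_miss_penalty(BASE, 0.7, 0.3)

print(current, halved, split)
```

Both proposed weightings bring a fully missed LN down to about the cost of one missed normal note, instead of twice that.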
Shoegazer

Shoegazer wrote:

Rainbow Accuracy

Shoegazer wrote:

320s are very much underweighted because the only component of the scoring system that takes into account 320 accuracy is the combo component, which only has a 20% prominence. Add on to the fact that the difference between a 300 and 320 is so small and that the absolute difference between juan and Hudo's 320 count isn't that significant, it would make sense that 320s are really underweighted at the moment.

You could mitigate this by including 300gs into accuracy, but from what I've experimented it might create too much emphasis on MAX accuracy with charts that players have issues getting 96%+ on (and as a result would not be an accurate assessment of skill).

Alternatively, you can avoid including MAXes in the accuracy component and just increase the importance of MAXes to like 360 to increase the emphasis of it by a noticeable but not overpowering amount in the combo component, but that requires a bit more experimentation.
I initially wanted to increase the rainbow judgement weightage without embedding rainbows into accuracy, but no matter how much I changed it, the difference is very minor (~600-1,200 points) and a 200 will almost always be too powerful compared to a rainbow 300. So I scrapped that idea and thought that embedding rainbows into accuracy with a reasonable weightage and maybe making the curve more lenient would be the best idea.

I've been experimenting with weightages and discussing with people how much a 200 should be worth compared to a 300. I initially thought that 310 would be fine (and a 200 would be worth 11 300s), but when it came to matches like this, if accuracy was the only factor, Argentina would win by 21,000 points. I do think that Argentina should win and it's a step in the right direction, but 21,000 seems extremely overwhelming since it undermines the fact that Poland had noticeably fewer 200s overall. I tried it with harder charts too and they seem to favour rainbow accuracy a little too much for my liking - especially since when it comes to harder charts (which players struggle with), good rainbow accuracy is usually caused by variance rather than a higher skill level. 200s and worse judgements should determine performance for that.

I wanted to use 307 afterwards, but it still gave a bit too much emphasis for my liking, about 12,500 points for that Argentina/Poland match. I went down to 305, and the difference is about 6,800. I think that's ultimately the most reasonable assessment, and others I've talked to seem to agree with the notion that a 200 is about 21 normal 300s. Ignoring the bad judgements (since those values are pretty much set in stone at this point), this is probably (part of) the ideal solution. This does mean that only full rainbow scores are SSs, but I don't see that as a problem as frames of reference can be shifted.

Getting rid of the difference between a rainbow and a normal 300 in the combo scoring component is probably ideal too, since that should be in the accuracy component, not the combo component. If rainbows are included into accuracy, the combo component does not need a rainbow component.

I also wanted to soften the exponential curve a tiny bit when it comes to including rainbows, mainly because at a certain point extremely good accuracy is more caused by variance rather than a very high skill level - unless the performance is consistently done, which is not measurable with just one match and one attempt. The exponential I had in mind was Accuracy^(2 + 2 * Accuracy), but it's essentially Accuracy^4 - so 1 power down.

tl;dr: Embed rainbows into accuracy with a weightage of 305 instead of 320, change the accuracy curve to Accuracy^(2 + Accuracy * 2), remove the differentiation between rainbows and normal 300s in the combo component (both of them should have a HitValue of 30).
Reposting this as well; considering that rainbow accuracy is a major component in assessing skill, I think embedding higher emphasis in rainbow accuracy will make the scorev2 system more accurate when it comes to assessing skill.
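The numbers in the quoted proposal can be checked with a small sketch. It assumes accuracy is the weighted judgement sum divided by the MAX weight times the note count, and that the judgements below MAX keep the usual 300/200/100/50/0 weights; the function names are made up for illustration.

```python
# Only the MAX weight is the subject of the proposal; the other weights
# are assumed to be the usual osu!mania accuracy values.
BASE_WEIGHTS = {"300": 300, "200": 200, "100": 100, "50": 50, "miss": 0}


def equivalent_300s(max_weight, judgement_weight=200):
    """Number of normal 300s that cost as much accuracy as one lower judgement."""
    return (max_weight - judgement_weight) / (max_weight - 300)


def rainbow_accuracy(counts, max_weight=305):
    """Accuracy in [0, 1] with MAX (rainbow 300) weighted above a normal 300."""
    weights = dict(BASE_WEIGHTS, max=max_weight)
    total = sum(counts.values())
    earned = sum(weights[name] * n for name, n in counts.items())
    return earned / (max_weight * total)


def accuracy_curve(accuracy):
    """Proposed curve: Accuracy^(2 + 2 * Accuracy)."""
    return accuracy ** (2 + 2 * accuracy)


for w in (310, 307, 305):
    print(w, equivalent_300s(w))
```

With a MAX weight of 310 a 200 costs as much as 11 normal 300s, and with 305 it costs as much as 21, which matches the figures in the quoted post.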
Remyria
I don't think making SS max scores only is a good idea, with how hard it is. The accuracy needed for hitting only 300g is on a completely different level than in any other mode. I genuinely hope it's not planned for osu!lazer and is just a tournament thing.