Performance Points feedback and suggestions (Standard)

Full Tablet

jesus1412 wrote:

The reason I suggested this is because star rating already finds the most difficult parts of the map, at least to my knowledge. Tom's difficulty calculator could generate graphs of the difficulty at certain times, which is why I thought the idea was feasible. Here's a picture of one of the graphs in question:

If you had a list with the difficulty of each combo increment in the map (made with the strain graphs: circles, spinners and slider starts would have a value equal to their respective strains, while slider ticks and ends would have a fraction of their respective slider strains), you could find the easiest section of the map that gives X combo. Here the "easiest section" is the one with the smallest sum of strains. With that criterion it might be a good idea to fine-tune the scaling of the strains, for example by raising each strain to the power of 1.5, in order to increase the relative worth of the hardest parts.
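The search described above can be sketched as a sliding window over the per-combo strain list. This is only an illustration under assumptions: the function name, the toy strain values and the prefix-sum approach are made up here, and it is not the code that produced the graphs in this post.

```python
from itertools import accumulate

def easiest_section(strains, combo, power=1.5):
    """Return (start_index, cost) of the contiguous run of `combo` combo
    increments whose summed strain**power is smallest."""
    if not 0 < combo <= len(strains):
        raise ValueError("combo must be between 1 and len(strains)")
    # prefix[i] holds the sum of the first i scaled strains, so any
    # window sum is a single subtraction.
    prefix = [0.0] + list(accumulate(s ** power for s in strains))
    best_start, best_cost = 0, float("inf")
    for start in range(len(strains) - combo + 1):
        cost = prefix[start + combo] - prefix[start]
        if cost < best_cost:
            best_start, best_cost = start, cost
    return best_start, best_cost

# Hypothetical strain values, one per combo increment.
strains = [1.0, 4.0, 1.0, 1.0, 9.0, 1.0]
print(easiest_section(strains, 2))
```

Each window costs constant time after the prefix sums are built, so one combo length is linear in the max combo.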

For example: In a map with maximum 100,000 combo with the speed strain like this (I just used a random number generator and a Gaussian filter to generate the graph, with many points to test how fast the computer can find those values):

The easiest sections that give 10%, 20%, 30%, ..., 90% of the max combo would be:

(Calculating and generating those graphs took 3.447622 seconds; in maps with a smaller max combo it would take considerably less.)
Third Example
With another strain graph
http://a.pomf.se/pkjgav.mp4
http://a.pomf.se/panvrz.mp4 The same strain graph, but the strain values were squared before applying the algorithm, so the hardest parts are weighted more. Ideally, the criterion should be "the section that would have the lowest star rating", but because of the way star rating is calculated (a weighted sum based on the rank of the strain values), that would severely increase the calculation time.
Finding those sections could be useful for determining the worth of getting a certain amount of max combo in a map when calculating pp. For example, the difficulty factor could be related to the star difficulty of the section that was determined to be the easiest instead of the overall star difficulty (and, to compensate, the penalty for misses and lost combo would be reduced). This would balance the amount of pp given in maps where getting 95% of the max combo is relatively easy because the hardest part is right at the end of the map, and in maps where the hard parts are in the middle, where getting 75% of the max combo (caused by a random mistake after doing well on the hard parts) is undervalued.

The main issue would be having to calculate the star difficulty for each play (which could potentially be very expensive for the servers). To avoid that, instead of calculating after each play, calculate several star difficulties per map for varying combo and interpolate between the pre-calculated values for each play (this would considerably increase the time it takes to recalculate values after each algorithm change, though).
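The pre-calculate-and-interpolate idea could look roughly like the sketch below. It assumes plain linear interpolation between stored points; the table layout, the function name and the numbers are hypothetical and only there to show the lookup.

```python
from bisect import bisect_right

def interpolate_difficulty(table, combo):
    """Linearly interpolate a star difficulty from pre-calculated
    (combo, stars) pairs, which must be sorted by combo."""
    combos = [c for c, _ in table]
    # Clamp to the stored range instead of extrapolating.
    if combo <= combos[0]:
        return table[0][1]
    if combo >= combos[-1]:
        return table[-1][1]
    hi = bisect_right(combos, combo)
    (c0, s0), (c1, s1) = table[hi - 1], table[hi]
    t = (combo - c0) / (c1 - c0)
    return s0 + t * (s1 - s0)

# Hypothetical pre-calculated values for one map.
table = [(100, 2.0), (500, 4.0), (1000, 5.0)]
print(interpolate_difficulty(table, 300))
```

With this, a play only costs a binary search at submission time, and the expensive part moves to the per-map recalculation.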
ivan
x
Full Tablet
About the previous post, it had some issues:
- The algorithm didn't always find the sections with smaller star rating (it did most of the time when squaring the strain values, but not always), changing it so it always finds the correct sections would increase the calculation time considerably.
- In maps where the difficulty is near constant, the star rating of a section of the map is too close to the star rating of the whole map (so in those maps, the difference between ~30% and 80% of the max combo would be small). Similarly, if a map has 2 difficulty spikes of the same difficulty, the star rating of the sections with 1 difficulty spike is similar to the star rating of the section with both difficulty spikes.

So, there would be changes to fix those issues:
- Take the strain values of the combo increments, and weight them according to their rank (the highest value keeps 100% of its value, the second 99%, the third 0.99^2, and so on); the sum of those weighted values is then related to the star rating of the map. Find the easiest sections using the criterion of "smallest sum" of those weighted values.
- Instead of calculating the star rating of the sections found, just take the sum of the weighted strain values in the section found (weighted according to the rank of the combo increments in the whole map, not just the section), and give the combo factor in pp based on that.
Examples:
Graphs
Map with the hardest section in the middle:

Hardest part towards the end:

2 Difficult Parts

Near constant difficulty

Perfectly constant difficulty: this is a pathological case. The result here was caused by ranking strains of the same value in canonical order: the first element ranked 1st, the second element ranked 2nd, and so on. Actually, since all values are the same, any ranking makes sense, and each way of ranking generates a different combo curve. The best way to solve this would be to give equal values the same rank, chosen so that the sum of the weightings of the tied strains stays the same. Example: 4 strains tied for first place; the fifth element would be ranked 5th, while the first 4 elements are ranked (with u=4, so each rank is ~2.49372); if 6th and 7th are tied as well, then their ranks are 5+(previous expression with u=2)=6.49874.
This would make the curve linear (and more similar to the "near constant" difficulty curve)
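The expression for the shared rank did not survive in the text above. One formula that reproduces both quoted numbers, assuming a strain ranked r is weighted 0.99^(r-1), is sketched below; the function name and argument names are made up for the example.

```python
import math

def tied_rank(first_position, tie_count, decay=0.99):
    """Shared rank for `tie_count` equal strains occupying positions
    first_position .. first_position + tie_count - 1, chosen so the
    summed weight decay**(rank - 1) of the group is unchanged."""
    # Mean of decay**0 .. decay**(tie_count - 1), via the geometric series.
    mean_weight = (1 - decay ** tie_count) / (tie_count * (1 - decay))
    return first_position + math.log(mean_weight, decay)

print(round(tied_rank(1, 4), 5))  # four strains tied for first place
print(round(tied_rank(6, 2), 5))  # 6th and 7th tied
```

A single untied strain (tie_count of 1) keeps its ordinary rank, so the rule reduces to the normal ranking when there are no ties.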

Fixed combo factor curve (each strain value is ranked 230.1095838724)
How the combo factor looks currently for comparison (the shape is what is important, all curve values should be multiplied so the max value is 1)
silmarilen
this wont work because it cant see where you missed. if you got 700 combo on a map, hitting the hardest part but getting a random miss at such a point that you did not get the smallest possible combo that you had 100% fc'd the hardest part (lets say the hardest part is at 720 combo into the map, and going on for 20 notes, meaning to get the most out of it you would have to have about 750 combo) you would not get the full difficulty bonus, even tho you hit the hardest part, only because your combo was 50 short.
making 50 combo make such a huge difference when you did the same in terms of fcing the difficult part is not how it should work.
Full Tablet

silmarilen wrote:

this wont work because it cant see where you missed. if you got 700 combo on a map, hitting the hardest part but getting a random miss at such a point that you did not get the smallest possible combo that you had 100% fc'd the hardest part (lets say the hardest part is at 720 combo into the map, and going on for 20 notes, meaning to get the most out of it you would have to have about 750 combo) you would not get the full difficulty bonus, even tho you hit the hardest part, only because your combo was 50 short.
making 50 combo make such a huge difference when you did the same in terms of fcing is not how it should work.
With the data available for the pp calculation, it is safer to assume the combo breaks were present in the hardest parts of the songs (and give a high combo factor if it can be sure you got the hardest part of the map right). After all, most of the time the combo breaks are present in the hardest part of the map (with random combo breaks being the exception, except in maps that are too long or where it is hard to keep concentration).
Vuelo Eluko
but how does the system know where in the combo the hard parts are?
i dont think it does.
silmarilen

Full Tablet wrote:

silmarilen wrote:

this wont work because it cant see where you missed. if you got 700 combo on a map, hitting the hardest part but getting a random miss at such a point that you did not get the smallest possible combo that you had 100% fc'd the hardest part (lets say the hardest part is at 720 combo into the map, and going on for 20 notes, meaning to get the most out of it you would have to have about 750 combo) you would not get the full difficulty bonus, even tho you hit the hardest part, only because your combo was 50 short.
making 50 combo make such a huge difference when you did the same in terms of fcing is not how it should work.
With the data available for the pp calculation, it is safer to assume the combo breaks were present in the hardest parts of the songs (and give a high combo factor if it can be sure you got the hardest part of the map right). After all, most of the time the combo breaks are present in the hardest part of the map (with random combo breaks being the exception, except in maps that are too long or where it is hard to keep concentration).
but then you are just assuming things. you are not fixing a problem, you are just shifting the problem somewhere else.
Full Tablet

silmarilen wrote:

but then you are just assuming things. you are not fixing a problem, you are just shifting the problem somewhere else.
Well, fixing the issue you mention would require knowing where the misses are (or taking even more assumptions, for example: making a model that estimates the probability the misses were in random easy parts instead of the hardest parts of the map, which would increase the complexity of the calculation and could potentially give overrated pp values).
jesse1412

Riince wrote:

but how does the system know where in the combo the hard parts are?
i dont think it does.
I posted the graphs the old tp system uses. This already happens.

silmarilen wrote:

Full Tablet wrote:

With the data available for the pp calculation, it is safer to assume the combo breaks were present in the hardest parts of the songs (and give a high combo factor if it can be sure you got the hardest part of the map right). After all, most of the time the combo breaks are present in the hardest part of the map (with random combo breaks being the exception, except in maps that are too long or where it is hard to keep concentration).
but then you are just assuming things. you are not fixing a problem, you are just shifting the problem somewhere else.
Rewarding SOME people for hitting the hardest part is better than rewarding none, plus the second person genuinely had a better play because they full combod the hard part and even some extra parts compared to the other player. It's also safe to assume if someone can do the hardest part of the map without missing then they can full combo the rest (this is the case the majority of the time).

I'd say that it's better to reward as many performances as possible rather than denying some people their pp just because someone else COULD have FC'ed the same hard part.
Vuelo Eluko
i see, the graphs just had time so i didnt know if it had specific enough information to narrow it down to specific objects/combo #
Drezi

silmarilen wrote:

but then you are just assuming things. you are not fixing a problem, you are just shifting the problem somewhere else.
I think the idea here does not intend to reduce the value of current scores under any circumstances, nor should it mean giving small combo scores near full combo values.

Instead, using this we could reward some non-full-combo performances adequately higher than what they are currently worth with the (your combo^0.8/total combo^0.8) weighting, WHEN we can be sure that even the worst case scenario (the easiest consecutive "your combo length" section) includes the hardest parts.

So this way there wouldn't be any negative aspects to this, only the upside that certain non FC performances could be rewarded higher, where we can be sure that it's justified.
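A minimal sketch of that "only ever bump up" reading: keep the current combo weighting and take the larger of it and a worst-case section factor. The function names are hypothetical, `worst_case_factor` stands for whatever value the easiest-section analysis would produce for that combo, and using a plain maximum is an assumption about how the two would be combined.

```python
def current_combo_factor(combo, max_combo):
    """Current weighting: combo^0.8 / max_combo^0.8, capped at 1."""
    return min((combo / max_combo) ** 0.8, 1.0)

def bumped_combo_factor(combo, max_combo, worst_case_factor):
    """Never lower than the current factor; higher only when the
    worst-case section analysis justifies it."""
    return max(current_combo_factor(combo, max_combo), worst_case_factor)

# 500/1000 combo: about 0.574 today, 0.9 if the easiest 500-combo
# section is known to contain the hardest part.
print(current_combo_factor(500, 1000))
print(bumped_combo_factor(500, 1000, 0.9))
```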
Full Tablet

Drezi wrote:

silmarilen wrote:

but then you are just assuming things. you are not fixing a problem, you are just shifting the problem somewhere else.
I think the idea here does not intend to reduce the value of current scores under any circumstances, nor should it mean giving small combo scores near full combo values.
Well, comparing the proposed combo factor with the current one, maps that have the hardest part at the very end or very beginning would get a relatively smaller combo factor for non-FC plays; but I don't think it is a bad thing.

For example, one of the reasons this map with DT gives so much pp for its difficulty https://osu.ppy.sh/b/84811?m=0 is because the hardest part is at the very end, but the map is otherwise relatively easy, so you get a considerable amount of pp even if you failed the jumps at the end.
Drezi
Yeah, I don't think it would be a bad thing either, I always support changes for the better, but most people don't like bigger changes, and here the current weight could simply be bumped up when it's appropriate, so that shouldn't be problematic in any way.
Nyxa
Unrelated to the previous discussion (It's interesting though, and I support the idea) - How are doubles weighted in this system? Are they counted separately or as a two note stream? Because, if it's the latter, then they are heavily undervalued. I know for a fact that a majority of the players I talk to have a lot of difficulty with playing doubles, mostly because they find the rhythm odd and because the constant switching from blue to red/white polarity that comes with doubles (or 1/6 patterns which are even harder to understand for non-musically inclined people) confuses them as opposed to your regular 4/4 signature rhythms.

I think that, unless doubles have already been addressed separately, they should be. A map full of doubles (like Lan's diff in Yoiyami Hanabi or Tsukimiyo Rabbit) is quite difficult to get a high accuracy on, especially when the OD is up there and when the map is fast. I apologize if I missed any sections on them, but I feel like they're something that should be addressed.

And then maybe take a look at 1/3 type patterns as well, since those are also often confusing for a lot of players. I don't think the boost should be huge (and maybe it's already there) but it would be nice to know whether those are rewarded in some form or other.
Drezi
I don't think there's such a thing as counting two notes separately or as a stream. Time and distance between notes is looked at afaik.

It's true that complex rhythms are harder to acc though, not sure how well it is accounted for.
GhostFrog
How difficult a map is rhythmically isn't taken into consideration at all right now. Doubles are treated the same as any other notes and contribute to the strain values in the same way any other notes would based on their position and timing.

1/3 patterns aren't given any bonus and I don't think giving them a bonus would be a good thing - changing the listed bpm of a map would change which patterns are 1/3 without changing anything about gameplay. It would probably make sense to give a bit of a bonus when the notes change from 1/3 to 1/2 to 1/6 to 1/4 etc, but that also is currently not considered.
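A rough sketch of how such rhythm changes could be detected without looking at the listed BPM: compare the time gaps between consecutive hits directly. The function name and the 5% tolerance are assumptions for the example, and nothing like this is part of the current calculation.

```python
def rhythm_changes(hit_times_ms, tolerance=0.05):
    """Count how often the gap between consecutive hits changes by more
    than `tolerance` (relative), e.g. when going from 1/2 to 1/4."""
    intervals = [b - a for a, b in zip(hit_times_ms, hit_times_ms[1:])]
    changes = 0
    for prev, cur in zip(intervals, intervals[1:]):
        if abs(cur - prev) > tolerance * max(prev, cur):
            changes += 1
    return changes

# Gaps of 100, 100, 50, 50, 100 ms: the rhythm changes twice.
print(rhythm_changes([0, 100, 200, 250, 300, 400]))
```

Since only the actual gaps are compared, relabelling the BPM so that a pattern becomes "1/3" instead of "1/4" would not change the count.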
Drezi
That's a shame, when it comes to rhythm the less repetitive it is, the harder.

I mean it's like anyone can hit a constant beat on a drum, but even a repeating pattern is harder to pull off..
Miku Maekawa

Drezi wrote:

I mean it's like anyone can hit a constant beat on a drum
if you told a random person to keep a steady, simple beat on a drum at some normal bpm

you'd be surprised at how many people would have the tendency to speed up drastically if they didnt have some sort of metronome to follow
Full Tablet

Drezi wrote:

That's a shame, when it comes to rhythm the less repetitive it is, the harder.

I mean it's like anyone can hit a constant beat on a drum, but even a repeating pattern is harder to pull off..
Something like the algorithm here in tom94's ask.fm could be used http://pastebin.com/cFGUJdGa

It is for taiko, but could be used for standard too if the only variable of the objects is the time between hits, with only one color present, considering both circles and slider starts as the same kind of object. Sliders might be considered a little different (probably making sliders of a certain duration have a "partial" match with circles or sliders of different duration if both share the same time between key presses, where the partial match reduces the rhythm complexity strain less than a full match).

Using a weighting of the strains of 0.9975 (So the maximum value is 400):

"Rhythm Complexity"
xi - FREEDOM DiVE [FOUR DIMENSIONS]: 348.488
https://osu.ppy.sh/b/297463&m=0 351.973
https://osu.ppy.sh/b/312959&m=1 324.277
https://osu.ppy.sh/b/443272&m=0 271.207
https://osu.ppy.sh/b/323875&m=0 256.527
https://osu.ppy.sh/b/152078&m=1 369.495
https://osu.ppy.sh/b/58063&m=0 328.276
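For reference, the "maximum value is 400" follows from the geometric series 1/(1 - 0.9975) = 400, assuming each individual rhythm strain is at most 1 (an inference from that remark, not something stated in the pastebin). A small sketch of the weighted sum, with a hypothetical function name:

```python
def weighted_strain_sum(strains, decay=0.9975):
    """Sum the strains from highest to lowest, multiplying the i-th
    highest by decay**i."""
    ordered = sorted(strains, reverse=True)
    return sum(s * decay ** i for i, s in enumerate(ordered))

# With every strain at the assumed cap of 1, the sum approaches
# 1 / (1 - 0.9975) = 400 as the number of objects grows.
print(weighted_strain_sum([1.0] * 20000))
```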
Drezi

Apink Chorong wrote:

if you told a random person to keep a steady, simple beat on a drum at some normal bpm

you'd be surprised at how many people would have the tendency to speed up drastically if they didnt have some sort of metronome to follow
But in osu you DO have a metronome - the music itself (also if you start speeding up, you start getting 100s), and it doesn't even matter, cause the point is that relatively a constant beat is still easier to pull off more or less accurately than harder patterns.

@Full Tablet: that looks pretty good.
ivan
x
Nyxa
Are you planning on posting anything constructive?

Anyway, Drezi sees my point here. Rhythm complexity matters a lot: if you have a 5-second section that's filled with spaced 1/4 sliders, it'll be easier to get high accuracy on than on an equally long section of the same BPM with various triplets, doubles and streams. I also think that polarity shifts should be taken into account, and 1/3 rhythms should receive some attention of their own (though it would obviously depend on the map speed and difficulty how much of a bonus this would give). I think if you take rhythm complexity + jesus' idea of measuring how well you did based on the minimum max combo required to have FCd the hardest part of the map, you will already be a lot closer to accurately measuring + rewarding a map based on its difficulty. Per-hitobject data might be easier, but since that's not currently an option, there's nothing wrong with finding viable alternatives that would still be better/more accurate than the current system.

Also, I would really like to see a change in the weightings as mentioned a while ago. I'd been thinking of that and Drezi's idea of having it taper off to 0 faster, but weighing the higher scores heavier sounded like a great alternative. Based off of my own experience, even scores that aren't in my top 20 don't really give a significant amount of pp, so having them taper off at 40 (if I remember correctly) sounds extremely lenient to me. I don't see why you'd be chasing after 1% scores anyway.
Drezi
Well, we discussed a few ideas here that have potential imo, I'd be interested in seeing some kind of feedback at this point.
Topic Starter
Tom94

Drezi wrote:

Well, we discussed a few ideas here that have potential imo, I'd be interested in seeing some kind of feedback at this point.
The proposed max combo scaling would be an improvement over the pp system, but considering the effort it would take to implement it (adding it to the difficulty calculator, storing some kind of approximation of the combo scaling graph in the database etc.) other gamemodes should still be at a higher priority at the moment I think.

From my tests with alternative weighting of scores I still find that the current weight performs best, so there likely won't be a change in that regard.

Judging from other general feedback in here my general plans for standard are to slightly increase the value of small hitcircles, weight fast streams a bit higher compared to spaced streams and improve the accuracy weighting formula to better represent a probabilistic model. I think those changes would improve the current situation commonly perceived as "hardrock needs to be buffed versus doubletime".

I've been occupied with other things than osu! in the last few weeks and I don't know when I will find the time to further tune pp again, but I am still regularly reading the posts in the feedback threads.
ivan
x
Topic Starter
Tom94

Ivan wrote:

how long does it even take to do those kind of things ?
1: Adjust the difficulty algorithm to hopefully fix things (takes thinking and code adjustments - variable from minutes to hours)
2: Calculate new difficulty for all ranked maps (takes a few hours)
3: Calculate new pp for a select amount of players for testing (takes from minutes to hours, depending on how many players)
4: Repeat at 1 if not satisfied with result (usually needs quite a few repetitions to fix / prevent undesired side effects of the changes)
5: Apply the new difficulty algorithm to _all_ maps (takes ~1 day)
6: Push the new difficulty algorithm into the osu! client so that ingame star rating aligns with online star rating (makes everyone recalculate star difficulty in song select, takes some minutes to hours depending on how many maps there are. Might make song select stutter a bit while in progress)
7: Re-calculate pp for every player and hope for as little as possible "I lost 2 pp what is happening OMGOMGOMG" threads (takes ~1 week)
Woobowiz

Tom94 wrote:

"I lost 2 pp what is happening OMGOMGOMG" threads (takes ~1 week)
So does this imply we are to expect a net pp loss overall after the next change?
Oinari-sama

Woobowiz wrote:

Tom94 wrote:

"I lost 2 pp what is happening OMGOMGOMG" threads (takes ~1 week)
So does this imply we are to expect a net pp loss overall after the next change?
Not necessarily, there are usually some people who gain pp while others lose pp with every calculation change. It's just that those who've gained pp after a change usually keep quiet and grin, while those who've lost pp will go make threads/comments everywhere blaming the system for being "stupid" =.=

Do not under-estimate the effort to educate people after an "Armageddon" like that...
uzzi
I feel like ' ~1 week' is a bit of an understatement haha
ivan

Tom94 wrote:

Ivan wrote:

how long does it even take to do those kind of things ?
1: Adjust the difficulty algorithm to hopefully fix things (takes thinking and code adjustments - variable from minutes to hours)
2: Calculate new difficulty for all ranked maps (takes a few hours)
3: Calculate new pp for a select amount of players for testing (takes from minutes to hours, depending on how many players)
4: Repeat at 1 if not satisfied with result (usually needs quite a few repetitions to fix / prevent undesired side effects of the changes)
5: Apply the new difficulty algorithm to _all_ maps (takes ~1 day)
6: Push the new difficulty algorithm into the osu! client so that ingame star rating aligns with online star rating (makes everyone recalculate star difficulty in song select, takes some minutes to hours depending on how many maps there are. Might make song select stutter a bit while in progress)
7: Re-calculate pp for every player and hope for as little as possible "I lost 2 pp what is happening OMGOMGOMG" threads (takes ~1 week)

You could make this happen with no problem. I believe in you my friend
Vuelo Eluko

Tom94 wrote:

Drezi wrote:

Well, we discussed a few ideas here that have potential imo, I'd be interested in seeing some kind of feedback at this point.
The proposed max combo scaling would be an improvement over the pp system, but considering the effort it would take to implement it (adding it to the difficulty calculator, storing some kind of approximation of the combo scaling graph in the database etc.) other gamemodes should still be at a higher priority at the moment I think.

From my tests with alternative weighting of scores I still find that the current weight performs best, so there likely won't be a change in that regard.

Judging from other general feedback in here my general plans for standard are to slightly increase the value of small hitcircles, weight fast streams a bit higher compared to spaced streams and improve the accuracy weighting formula to better represent a probabilistic model. I think those changes would improve the current situation commonly perceived as "hardrock needs to be buffed versus doubletime".

I've been occupied with other things than osu! in the last few weeks and I don't know when I will find the time to further tune pp again, but I am still regularly reading the posts in the feedback threads.
Jesse top 100 the dream incoming.
Nyxa

Tom94 wrote:

Judging from other general feedback in here my general plans for standard are to slightly increase the value of small hitcircles, weight fast streams a bit higher compared to spaced streams and improve the accuracy weighting formula to better represent a probabilistic model. I think those changes would improve the current situation commonly perceived as "hardrock needs to be buffed versus doubletime".
Christmas came early this year.

EDIT:

I forgot to add this last time, but - about the max combo scaling; wouldn't it be possible to use the performance chart for that as well? It tells you about the HP bar's drain during the play, right? Which means that a section where the bar sits lower is one the player performed worse on. If you have a way of knowing where the most difficult sections in the map are, then it shouldn't be very hard to use the performance chart to determine how well the player did in said section. Maybe this is just a dumb idea because I'm missing something - but hitting more 100s results in a more empty hp bar, right? So it would be lower in that section. I know that this all heavily depends on the drain rate of the map, but there should always be some form of a difference between an SS and non-SS performance in the drain chart. If you combined that with max combo scaling, it might give you an even more accurate idea of the player's individual performance on a map.

Figured I'd throw that out there. Also, it might be nice to hear what kind of feedback there is on the doubles issue (unless that was already given and I missed it)
AJT
Problem Details: I SSed a map with NC and in my top ranks it says 202pp, however I gained absolutely no pp at all. Why is this?

Map link: https://osu.ppy.sh/b/443272?m=0


osu! version: 20140924.1
Woobowiz

akinator127 wrote:

Problem Details: I SSed a map with NC and in my top ranks it says 202pp, however I gained absolutely no pp at all. Why is this?

Map link: https://osu.ppy.sh/b/443272?m=0


osu! version: 20140924.1
The time between your top plays and their accuracies are hella suspicious yo

I'm kidding of course. The only solution is to wait for the pp to come, also that post should really be submitted in Tech Support
silmarilen
it was, and there they said to post it here
Woobowiz

silmarilen wrote:

it was, and there they said to post it here
What, who said that? I think they told him to post it in Gameplay & Rankings, but he posted it in this thread instead of the general G&R
silmarilen
https://osu.ppy.sh/forum/p/3404843
links directly to this thread
Drezi



Well, you wanted examples of HDHR where we feel it's undervalued, so here, to me this feels kinda wrong, I mean I know my acc on that HDHR play is bad, but still I could FC this song nomod like ages ago, and this same play, same timing of hits would be way higher acc if it was on OD8 not OD10.
GoldenWolf
yeah you lost 4.5% accuracy it's really underrated zzz
Woobowiz

silmarilen wrote:

https://osu.ppy.sh/forum/p/3404843
links directly to this thread
Well they chose the wrong place to redirect to, also not a Mod so he's even less credible

Drezi wrote:




Well, you wanted examples of HDHR where we feel it's undervalued, so here, to me this feels kinda wrong, I mean I know my acc on that HDHR play is bad, but still I could FC this song nomod like ages ago, and this same play, same timing of hits would be way higher acc if it was on OD8 not OD10.
4.3% is MASSIVE, I'd say that should be the right amount of pp.