Difficulty System and PP Formula suggestions

Topic Starter
I would post the suggestions in the PP feedback thread, but I think there's too many things to cover and I don't really want to flood the entire thread with a single post, so I think it'd be better to make a thread for this.

The performance point system is, admittedly, severely flawed - but there are ways to improve it enough for it to be very accurate. Below are the two main things I think could be improved to help the PP system. I admittedly haven't seen very much of the PP system, nor do I know much about its internals - but I generally understand the fundamentals of it and how those fundamentals could be improved. I'm not expecting these to be implemented or even read through, but it's probably better to share thoughts than to hide them and leave zero chance for the PP system to improve.

Difficulty System
A little disclaimer: the examples I bring up and the terminology I use only pertain to 4-key play, as it is the main mode I play and am most familiar with. However, the fundamentals of the difficulty system will apply to other keymodes as well, aside from 3K and under.

A huge part of the PP distribution comes from the difficulty system, and it is probably the main thing that should be fixed in order for the PP system to be fully useful. The main issue I've noticed with the difficulty system is its overemphasis on maximum NPS and chords in general. AiAe is rated as the hardest 4K map by far, despite only being the second hardest ranked map overall based on personal experience and leaderboards. Compare it to Imperishable Night 2006 (IN2006), which is rated 1.52 stars LESS than AiAe (5.24 vs 6.76) despite being harder overall. Having both of them at similar difficulty levels, or AiAe slightly lower than IN2006, would be fine, but let's break down the factors of what makes a good difficulty system:

Speed is essentially split up into two different components: one hand trilling/gallops and jacking. The jacks in speed files are generally very brief, but the difficulty of jacks in streams goes up far more steeply than it does with one hand trills. That's why IN2006 is so hard: it's littered with tons of jacks in ultra-fast runningmen, with occasional one hand trills to boot.

However, fast bursts and speed in general are relatively low in NPS. A 450 BPM stream has no higher NPS than a 225 BPM jumptrill or a 112.5 BPM quadjack. The stream is clearly harder than the other two, and while the difficulty system does try to show a difference in difficulty between those, the differences are not large enough to differentiate between all of them accurately.
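To make the NPS comparison above concrete, here is a minimal sketch of the arithmetic. It assumes the BPM figures refer to 16th notes (4 rows per beat); the function name and parameters are made up for illustration and are not taken from the actual difficulty calculator.

```python
def notes_per_second(bpm: float, notes_per_row: int, rows_per_beat: int = 4) -> float:
    """NPS for a pattern that has `notes_per_row` notes on every row.

    Assumes 16th-note rows (4 per beat) unless told otherwise.
    """
    rows_per_second = bpm * rows_per_beat / 60.0
    return rows_per_second * notes_per_row


stream = notes_per_second(450, 1)      # single-note stream
jumptrill = notes_per_second(225, 2)   # two-note chords on every row
quadjack = notes_per_second(112.5, 4)  # four-note chords on every row

print(stream, jumptrill, quadjack)  # 30.0 30.0 30.0
```

All three come out at 30 notes per second, so any rating that leans mostly on max NPS has very little to separate them with.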

You can fix this by putting significantly less weight on max NPS and more weight on patterning. One hand trills on either hand would increase the difficulty of a stream, even if it's just the minimum (3 notes) - which is why 1234321234 is harder than 123412341234. Jacks ramp up the difficulty as well, but it goes up exponentially IF the jacks that you're dealing with are longer than 4-5 notes and you also have to hit other notes in a stream on the same hand. An example would be a runningman, 43424342. I can't give numbers offhand on how much a 300 BPM 10-note runningman (5-note 150 BPM jack in a stream) or a 260 BPM 6-note one hand trill should increase the difficulty of a map - but I'd say that a 300 BPM 12-note runningman is about as hard as a 265-270 BPM one hand trill, so hidden jacks in runningmen should increase difficulty around 75%-80% more than a one hand trill of the same length. The longer the one hand trills/jacks in runningmen, the higher the increase should be, of course.
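As a rough illustration of what "weight on patterning" could look at, here is a small sketch that measures the two things mentioned above for a single-note 4K pattern written as a string of column numbers. It assumes columns 1-2 are the left hand and 3-4 the right hand; the function names are hypothetical and this is not how the current system works.

```python
LEFT_HAND = {1, 2}  # assumed 4K layout: columns 1-2 left hand, 3-4 right hand


def longest_same_hand_run(pattern: str) -> int:
    """Longest run of consecutive notes that all fall on the same hand."""
    best = run = 0
    prev_hand = None
    for ch in pattern:
        hand = "L" if int(ch) in LEFT_HAND else "R"
        run = run + 1 if hand == prev_hand else 1
        prev_hand = hand
        best = max(best, run)
    return best


def longest_hidden_jack(pattern: str) -> int:
    """Longest run of one column repeating on every other note.

    In a runningman this is the anchor column, i.e. the jack at half
    the stream's BPM.
    """
    best = 0
    for start in range(len(pattern)):
        length = 1
        i = start
        while i + 2 < len(pattern) and pattern[i + 2] == pattern[start]:
            length += 1
            i += 2
        best = max(best, length)
    return best


print(longest_same_hand_run("1234321234"))    # 3 (the 343 and 212 trills)
print(longest_same_hand_run("123412341234"))  # 2
print(longest_hidden_jack("43424342"))        # 4 (column 4 on every other note)
```

Note that a one hand trill also has a column repeating on every other note, so a real analyzer would need to tell trills and runningmen apart before weighting them differently.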

As for hitting jacks while hitting other notes in a stream on the opposite hand, that belongs in the jacking category, which is what I'm going to talk about next:

Due to the current difficulty system, chordsmash files are insanely overrated. Because of the emphasis on maximum NPS, they are especially overrated when it comes to fast walls, like AiAe's 180 BPM jumpjacks/360 BPM jumptrill. There's too little emphasis on single jacking and on jumpjacks that aren't necessarily overly fast but change columns often. How much difficulty a single jack adds should be heavily determined by its speed and its length. An additional factor for jump/handjacks would be how often the chords change direction.

For minijacks, the difficulty curve would be anti-exponential (flattening out) up to a point (like 350-380 BPM?). There isn't much of a difference between a 250 BPM minijack and a 300 BPM minijack unless there's a bunch of them at once; in that case, the more one hand bias (e.g. 1122 or 3344) there is in the run of minijacks, the higher the difficulty should be. The gallop difficulty curve should be less exponential than that of one hand trills, but not considerably. The difficulty should also be higher when minijacks go into a hand or quad, as they are harder to transition into - especially with the lack of chord cohesion.

For jacks longer than 2 notes, the difficulty curves should be exponential - more exponential as the jack gets longer, up to a point (say 16 notes). The difficulty curve for 8-note jacks should start exploding by 155-160 BPM, but not as fast as it does at the moment. When it comes to runningmen that have the jack on one hand and the rest of the runningman on the other hand, they should be rated higher than a single jack of course, but not by very much.

Regarding jumpjacks and handjacks, difficulty should be increased further if the direction of the chords changes often. However, for jumpjacks, there should still be minijacks hidden inside the jumpgluts in order for the difficulty increase to count. In addition, jumpjacks that involve two hands (i.e. not [12] or [34] in a 4K context) should be rated higher than ones that involve one hand, but that's self-explanatory.

This applies more towards 7K+ play, since there isn't too much of a difference in finger strength between the index and middle fingers due to human anatomy. Patterns that are more biased towards the outer fingers (e.g. the ring finger) and the thumb should be rated higher, especially one hand trilling and runningmen. The ring finger, pinky finger and thumbs are objectively worse for rhythm gaming, due to their lack of flexibility and strength. I heard that this will be implemented in the following difficulty system revamp however, so disregard this part if it is going to be implemented.

I didn't talk about noodles/long notes here, because I don't have enough experience with or understanding of those to really give any constructive feedback on how they contribute to the difficulty system.

Ultimately, 5K+ maps will still be rated higher than 4K maps, even if both maps require the same amount of effort. This is because you have more fingers to deal with a generally higher NPS, with more factors included in 5K+ play. This is inevitable; it's impossible to really build a completely fair difficulty system that covers every keymode. Because of this (and this will more than likely be a controversial opinion), the system will give a very clear bias towards 5K and up players. I'm not saying that 4K players are inherently better than 5K+ players by any means, but I feel that 5K+ players are rewarded too much for what they're actually doing. They already have an advantage over 4K players, mainly because they could just use 1-3 fewer fingers to plow through 4K maps. You can argue that 4K and 6K+ have different emphasis in skills (5K is very similar to 4K), but if you're a top-tier 7K+ player, you should be able to plow through a good number of hard 4K maps. The reverse isn't true by any means, otherwise you'd see players like Staiain getting 96s and 97s in IN2006 7K Lunatic within a matter of weeks after destroying 4K. The fault is not in the difficulty system though, it's in the PP formula itself.

Performance Point Formula
The easiest way of reducing bias towards 5K+ players would be to reduce the amount of PP given for a map with more keys. It doesn't have to be much, and the PP given doesn't even have to be the same as for 4K players. It could be a general multiplier, or an exponential (albeit a very very small exponential curve) multiplier to adjust the amount of PP from a map with more keys accordingly. It would take some effort however, because you would have to check what a 4K map would be rated compared to a 5K/6K/7K/8K map if they have the same difficulty. With a perfect (or at least near-optimal) difficulty system however, an experienced o!m player (i.e. very good at every key mode) should be able to compare the difficulty between modes with relative ease. You could potentially create a multiplier from there.
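Just to show the shape of what I mean by a very small exponential multiplier, here is a minimal sketch. The base of 0.98 per extra key is a placeholder I made up, not a tuned or proposed value, and the function names are hypothetical; the real numbers would have to come from the cross-keymode comparison described above.

```python
def keymode_multiplier(keys: int, base: float = 0.98) -> float:
    """Hypothetical multiplier: 1.0 for 4K and below, shrinking slightly per extra key.

    `base` is a placeholder value, not something derived from real data.
    """
    extra_keys = max(keys - 4, 0)
    return base ** extra_keys


def adjusted_pp(raw_pp: float, keys: int) -> float:
    """Scale the raw PP of a play by the keymode multiplier."""
    return raw_pp * keymode_multiplier(keys)


for keys in (4, 5, 6, 7, 8):
    print(keys, round(adjusted_pp(100.0, keys), 2))
```

With a base this close to 1 the curve is nearly a flat multiplier, which is the point: the adjustment is meant to be small.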

How would this affect ranks overall? In order to be high up on the leaderboards, you would have to be proficient with every key mode. Even if you're really really good at 5K or 6K, if you're not good at 4K, you would not be high up on the leaderboards. 4K players would ultimately have to learn other key modes and it would take longer, but their ranks would no longer be understated as they were before.

So that's pretty much it. There isn't much else I could talk about, since the PP formula and the difficulty system are the bread and butter of osu!mania's competitive gameplay in the grand scheme of things.

Would like to hear thoughts on this, counterpoints to these suggestions and improvements would definitely be appreciated.
This is very interesting. I don't agree with "scaling pp for other keymodes" though. 7K will be biased anyway because it has much more hard stuff ranked, and a good system won't have to scale anything to get accurate ratings. Big props for all the other stuff, I hope the pp guys will read that many times
Topic Starter

-Kamikaze- wrote:

This is very interesting. I don't agree with "scaling pp for other keymodes" though. 7K will be biased anyway because it has much more hard stuff ranked, and a good system won't have to scale anything to get accurate ratings. Big props for all the other stuff, I hope the pp guys will read that many times
Regarding the difficulty system: Of course, a perfect difficulty system would not have to scale for other keymodes - but that'd require tons of effort and while it is possible (completely different numbers for different keymodes, different factors involved), I can't quite imagine it happening any time soon. I guess you can see this as a way to fix the difficulty system as it is at the moment and an additional step to reduce bias towards higher keymodes, before revamping it to a difficulty system that is able to encompass all key modes accurately.

Anyway, you're right when you said that the 7K mappool has more hard maps than 4K (and other modes), and it will be biased because of that. Regardless though, in an ideal scenario where the mappool for each key mode has the same number of hard maps, I do think it'd be completely unfair if the system gave significantly more PP to a player who plays 7K than to one who plays 4K when both have the same score and both files require around the same amount of effort, just because the difficulty system will on average rate higher-key maps higher than lower-key maps.

Thanks for the comment, much appreciated
This is awesome. Talked about this in another game.
The problem is, a difficulty analyzer of some sort would seem to be very difficult to make. The patterns described above, LNs, and even SVs add up to a complex mix.
If you could name a game that has an accurate difficulty system (system/machine generated and not placed by people) I will stand corrected and also be surprised.
The 7k issue has been stated too.
Other than that there seems to be no problem with this.
In my opinion, any sort of "finger placement dependent" criteria should be avoided. Many people play differently, whether it be with 2 thumbs or 2 pinkies, or even 1 hand versus 2 hands.

And while we're at it, I doubt any kind of pattern sensitive algorithm would be able to satisfy the majority and overcome subjective difficulty. The best way to go about this, which would also be much easier to do, would be a dynamic system: a system that takes into account the performances and the general skill level of the people who play a map and uses them to determine the map's level.

To avoid data falsification we could say that only players who have filled out their top play slots, players with x-amount of plays, or players whose ranking has stabilized are taken into consideration. Or really any filtering you see fit.

If no one in the top 100 can even S the map, the system will assign it an extremely high rating. If people who aren't even in the top 20k full combo it, it will receive an extremely low rating. Sure you'll get the random outliers that are better or worse at the patterns featured in the map, but the fact of the matter is the dynamic system will place this map at its best possible place through data regression and averaging. This will overcome most of the subjective issues, the 7K/4K weighting and literally any other problems that come with deciding a difficulty for a map. The reason for this is that it's based on real performance statistics and not subjective difficulty criteria.
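To give a rough idea of how such a dynamic rating could be computed, here is a minimal sketch. It is my own simplification and not the algorithm used by the LR2 project: it assumes each player already has some numeric skill score, filters out players with too few plays, and picks the skill level that best separates the players who cleared the map from those who didn't. All names and the 1000-play cutoff are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Play:
    player_skill: float  # hypothetical skill score of the player, higher is better
    play_count: int      # total plays of that player, used only for filtering
    cleared: bool        # whether the player met the target (e.g. an S rank)


def estimate_map_rating(plays: List[Play], min_play_count: int = 1000) -> Optional[float]:
    """Skill threshold that best separates clears from non-clears.

    Returns None when no play survives the filter.
    """
    eligible = [p for p in plays if p.play_count >= min_play_count]
    if not eligible:
        return None

    def errors(threshold: float) -> int:
        # Count plays the threshold gets wrong: a player at or above it
        # who failed, or a player below it who cleared.
        return sum((p.player_skill >= threshold) != p.cleared for p in eligible)

    candidates = sorted({p.player_skill for p in eligible})
    return min(candidates, key=errors)


plays = [
    Play(1.0, 2000, False),
    Play(2.0, 2000, False),
    Play(3.0, 2000, True),
    Play(4.0, 2000, True),
    Play(5.0, 10, False),  # too few plays, filtered out
]
print(estimate_map_rating(plays))  # 3.0
```

A real system would want a proper regression rather than a single cutoff, and it would need a way to handle maps that nobody has cleared, since here the rating can never exceed the best eligible player's skill.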

The system was already put in place for LR2 as a third party project and really it works quite amazingly.
http://walkure.net/hakkyou/komakai.html FAQ and algorithm
http://walkure.net/hakkyou/bms.html The re-ranked charts based off clear type
Considering the number of players, this should work out well.
The stuff stated below won't be a possibility for o!m but I just want to give an example:
In the game where I came from (FtB) this was somehow practiced. In the early days most 1000 npm maps would receive a 30/30 diff rating, but as a few years passed by 1800 npm+ maps got the 30/30 rating and newer 1000 npm maps got 22-26/30 due to the increasing number of skilled players. (Patterns are taken into consideration, not just npm.) Diffs are voted on by the community, so inactive voters don't get to revote.

However, in o!m, we have a very good spread of players, from jhlee to n>1 star players, and a good number of them too.
I don't see how this could fail. It's not like everyone will become an ET someday and that no beginners will remain.

This might also provide cleaner ratings (no diff will exceed 10 stars, where 10 stars is like the point of impossibility, or maybe 10.1 lol) *cough Fantazindy SHD
And yeah, since the diff will be demographically projected there should be no revoting problems.

All that is needed is that new players keep coming (not too frequently), because people improve in this kind of game. And if no new players arrive, the average skill of all players increases. I've seen this happen to the point that (going by the previous example) N diffs will be like H diffs.