Note: I've thought about this idea for a while, and now with the recent touchscreen debacle I thought it was time to make a post about it. This isn't actually a solution to the TS issue, but it is relevant because it sparked a lot of debate about ways to change the current PP system. Read it on the new osu! website for better formatting.
What I see most people complain about with regards to the current PP system is how it fails to reward some skills (e.g. tech reading) and rewards other skills unfairly (e.g. aim). All these concerns can be summed up in one word: balance. Players want balance between skills. This here is a proposal that tries to do that automatically, with each skill rewarded in proportion to the difficulty of attaining it.
I think this way of thinking about PP is the right way to go, but I can't guarantee that my preliminary version doesn't have flaws. My hope is just that, if the osu!devs think it's worth the time, this system gets tested on the database of scores that already exists, to see if the results look sensible and to check for weird cases. I can't do that myself because I don't have access to the database (except the top 100 plays on each map via the osu!api). Also, I don't want to pretend I have the definitive solution to anything; Peppy and others have much more experience thinking about this issue.
TL;DR: lol no, I'm not giving you a quick way out of this. Read the entire mind-bogglingly long thing. Also, there will be maths. :p
Background
This here is approximately the history of osu! PP systems.
Player A: Yo. I'm fantastic at aim. Can you give me some PP to recognise my achievements?
Peppy: Sure, let me change the algorithm to award PP based on aim.
Player B: Hey! I'm at least 10 times better at accuracy than that player is at aim, yet she has more PP than me. That's unfair!
Peppy: It's fine. Let me just tweak the algorithm to reward accuracy more.
-GN: *ahem* so what I can do is... well, it's kinda hard to explain. I'm better at what can only be called weird stuff than most other players, yet I don't get the amount of PP proportional to how difficult it is to be me. Here, have a look.
Peppy: what is that I don't even... ok, fine. I'll just tell the system to give you PP based on... how the heck do I tell the computer exactly what you're good at? Number of "wub"-sounds in a map? Complexity of sliders? Weight low-AR plays more? ...!? This seems at least NP-hard. Have a neat badge instead.
The complaints all seem to be some form of "the PP system does not weight X enough", where X can be skills like consistency, accuracy, sliders, tech reading, low-AR reading, etc.
And the proposed solutions to these complaints all seem to be some form of "weight skill X more". What I'm saying is that these kinds of solutions will never practically be sufficient, because we don't know exactly what kinds of skills being good at osu! requires, and we definitely don't know how to tell a computer what those skills are so that it can award PP based on them. So these types of solutions all seem to be a form of the typical error of applying specific solutions to a problem we can't specify.
The way to fix problems we can't specify is to offload the work of specifying the problem onto the solution itself. Yeah, uh, that was a confusing abstract garble of a sentence, so I should start talking specifics.
The core idea
PP!Balance is a two-step system, just like PPv2. First, the PP value of a play is calculated; then the PP from that play is added to your total PP with the same weightage system that PPv2 uses.
Here's the weightage system explained from the wiki entry (you can skip this if you already know).
osu!wiki on the PP weighting System
Performance points use a weighted system, which means that your highest score ever will give 100% of its total pp, and every score you make after that will give gradually less.
This is explored in depth in the weightage system section of the article above. To explain this with a simpler example:
If your top pp rankings contain only two scores, each worth 100pp, your total pp would then be 195pp.
The first score is worth 100% of its total pp as it is your top score.
The second score is worth only 95% of its total pp as it is not your top score, so it contributes only 95pp towards your total instead of 100.
Now, let us posit that you set a brand new 110pp score. Your top rankings now look like this:
- 110pp, weighted 100% = 110
- 100pp, weighted 95% = 95
- 100pp, weighted 90% = 90
As you may have figured out, your new total pp is not simply 195 + 110 = 305pp, but instead 110 + 95 + 90 = 295pp.
This means that as you gradually improve at osu!, your pp totals will trend upwards, making your older scores worth progressively less compared to the newer, more difficult scores that you are updating them with.
The weightage system is necessary so that players can't gain more PP simply by setting more and more scores. If there were no weightage system in place, a player's PP would just be a measure of how many scores a player has set, not a measure of skill.
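For the programmatically inclined, here's a minimal sketch of that weighting in Python, assuming the usual ~0.95 per-rank decay factor (the wiki example above just rounds the third weight to 90%):

```python
def total_pp(play_pps, decay=0.95):
    """Weighted total PP: the best play counts in full, every play after that a bit less."""
    ranked = sorted(play_pps, reverse=True)
    return sum(pp * decay ** i for i, pp in enumerate(ranked))

print(total_pp([100, 100]))       # 195.0, matching the wiki example
print(total_pp([110, 100, 100]))  # ~295.25 (the wiki rounds the 90.25 weight down to 90)
```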
And how are the PP values for each play calculated?
The idea is that the PP of a play is supposed to measure how much better than others you are at that map. Call this the play's "Superiority". There are several ways to measure superiority of a play (e.g. rank achieved on map, percent score of top score, difference between the score and avg score and so on), and I talk about which way I think works best in the next section.
The essential difference between PP!Balance and PPv2 is that, unlike PPv2, PP!Balance doesn't try to directly define what it is exactly that is difficult in a beatmap (e.g. aim, accuracy, speed, reading, finger control, stamina and so on). Instead it lets the difficulty of a map be determined empirically by players' ability to play it.
If the hardest beatmap in the game is actually a lot more difficult than its low 5* rating suggests, you should get more PP for FCing it than for other maps of the same star rating. In fact, the ezpp! plugin tells me that you get 179pp and 201pp for SSing the two maps (Scarlet Rose and the Veela map below) on the current PP system, so it actually does worse than rating them equally.
The PP system I'm proposing would give much more PP for FCing Scarlet Rose than the Veela map because FCing Scarlet Rose is very much better than most other players can do on the map. FCing the other map is still better than most, but it isn't as much better than an FC on Scarlet Rose is, so you won't get as much PP for it.
This also means that you can gain PP by being good at whatever you want, as long as there are ranked beatmaps for it. If you're not much better than others at aim, but you're still much better than others at tech reading, then you'll have an easier time gaining PP by playing tech maps than aim maps.
The players with the highest PP on this ranking system would then tend to be the players who outperform everyone else by the largest margin, which sounds sort of correct. In PPv2, the highest ranked players tend to be those who are best at a combination of aim, speed and accuracy, while the other skills required to do well on beatmaps in osu! are mostly ignored.
Here be maths
I will now proceed to cobble together the formula I think works best for calculating PP. I'm not sure this is the best formula possible. But instead of giving you my best suggestion first, I'm going to take you through some steps I personally went through in order to arrive at my best suggestion. I want to show you that there are options here, and I need to discuss them in order to explain why I think my latest version works best.
If you don't understand the technical details, you can still understand the system in broad strokes from the previous section and then give constructive feedback on it. No feedback is stupid (but pls don't yell at me).
So without further ado,
First idea: Rank-based superiority
My first thought was just, why not use the play's rank on the beatmap? If you get first place you get more PP than if you get 2nd place, and so on. We note that the difference in difficulty between the 1st and 2nd rank usually tends to be much greater than the difference in difficulty between the 101st and 102nd rank, so we conclude that there is something like a logarithmic distribution in the difficulty of attaining map ranks. Therefore the PP value of the scores should also scale logarithmically. We end up with something like this. (edit: Was PPv1 something similar? osu!wiki doesn't say.)
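Written out, with N, k and MapRank as described below, that formula is roughly:

PP_play = (N / k) * (1 / ln(MapRank + e - 1))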
In the first term, N is the number of other plays made on that map (including your own plays and failed plays) set within the past 365 days. The more plays-within-a-year a beatmap has, the more PP you get for a high score on it. The time limit is necessary so that new maps, which may be just as hard as old maps but have fewer plays on them, aren't unfairly disadvantaged.
The reason why the PP of a play should depend on N is that the more people try a beatmap, the higher the competition is for getting high ranks on it. It is more difficult to win competitions where lots of other players compete, so you deserve more PP for ranking highly on them.
k is a constant weight that determines the importance of N. For example, say that k = 10,000. Then if you set a score on a beatmap with a 200,000 within-year play count, you multiply the play's PP value by 20. Or if the within-year play count was only 5,000, then you multiply the play's PP value by 0.5. It could also be that a different scaling of N works better.
The second term (1/ln(stuff)) is supposed to scale inversely logarithmically with your rank on a beatmap. Calculating for a few values of MapRank, we see that the PP value of 1st place gets multiplied by 1, 2nd place by 0.76, 101st by 0.2159, 102nd by 0.2154, and so on. It checks out.
(e-1 is there in the logarithm to make sure that the term equals 1 for MapRank = 1. If I instead wrote 1/ln(MapRank) without the e-1, then the term would equal 1/ln(1) when MapRank = 1. And since ln(1) = 0, you would have to divide by zero, and that is bad.)
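A quick sanity check of those multipliers, as a sketch:

```python
import math

def rank_multiplier(map_rank):
    """The inverse-log term 1 / ln(MapRank + e - 1); exactly 1 at rank 1."""
    return 1 / math.log(map_rank + math.e - 1)

for rank in (1, 2, 101, 102):
    print(rank, round(rank_multiplier(rank), 4))  # 1.0, 0.7615, 0.2159, 0.2154
```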
The reason I think this way of calculating PP is a bad idea is that on some maps many players share the same score at the top, and it would be unfair to give different amounts of PP for exactly the same performance. Conversely, on other maps the first few ranks are arguably a lot better than any of the other scores, and this system isn't sensitive to degrees of superiority like that.
Second idea: ScoreV1-based superiority
Take your score and divide it by the average score on that beatmap. If your score is exactly average, then PP_play = 1. If your score is 100 times greater than average, PP_play = 100. If your score is exactly half the average score, PP_play = 0.5. This system can then tell the difference between a score which is barely 1st place and a score which is way above what's needed for 1st place, and give PP accordingly. The idea assumes that average score on a beatmap is a good proxy for how difficult it is.
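Written as an equation, that's simply:

PP_play = YourScore / AverageScore(map)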
We can easily do some magic to make PP values range from 1 when the score is 100 times lower than average up to 1000 when the score is 100 times greater than average. But the core idea is easier to represent with the simple equation above. We'll give the PP values sensible ranges later.
It's probably wisest to take the average of all the top scores set by all players who have played the map (so if you play the map twice, only the top score gets counted when calculating the average). This is to prevent people from being able to purposefully set lots of bad scores on a map in order to drag down the average, making their top play worth more PP.
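As a sketch of that averaging, assuming we have every submitted score together with the player who set it:

```python
from collections import defaultdict

def map_average(scores):
    """scores: (player_id, score) pairs for one beatmap.
    Only each player's best score counts towards the average."""
    best = defaultdict(int)
    for player, score in scores:
        best[player] = max(best[player], score)
    return sum(best.values()) / len(best)

print(map_average([("a", 900_000), ("a", 50_000), ("b", 400_000)]))  # 650000.0
```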
The main reason I think this system is insufficient is that it inherits the priorities and flaws of the current scoring system. A lot of players complain that accuracy has little effect on score, and that would mean accuracy also has little effect on PP values in this system.
We have a few options here. The first option is to record all plays players make on maps, calculate what their ScoreV2 scores would be (even if they played using ScoreV1, we just recalculate using ScoreV2), and then do the PP!Balance calculations using those scores. Or we can invent a new scoring system and do the same with that. I prefer using ScoreV2, but you're free to argue why using something else is better.
Third idea: ScoreV2-based superiority
If a beatmap uses ScoreV1 for its leaderboard, then your top score on that leaderboard may not necessarily be your top PP play, because ScoreV1 and ScoreV2 are calculated differently. This is actually a problem with the current PP system too, where you can set a play with a higher score but a lower PP value, causing you to lose PP. If beatmap leaderboards continue to use ScoreV1, then I think the solution is just to keep two leaderboards for each beatmap, one for each scoring method, with the ScoreV2 leaderboard hidden. That way, you can set plays that have a higher ScoreV1 value than ScoreV2 value without losing PP.
For the purposes of the PP calculations, the average score on the beatmap would then be the average of the hidden ScoreV2 leaderboard. If a player has played the beatmap twice, only their top ScoreV2 play gets counted into the average.
A problem with this approach is that different maps have different HP values. For example, Airman has an HP of 3, which means it accumulates a lot more bad (but passing) scores than it would with HP 7, dragging its average down considerably. And since the average goes down, every play's PP value on the map goes up, making it easier to gain PP from maps with low HP. That's bad, because maps with lower HP values aren't necessarily harder than maps with higher HP values.
The solution will probably have to involve some way of dealing with failed scores: the more failed scores a beatmap has, the more PP it should give. One way of dealing with this could be to submit failed scores (i.e. their ScoreV2 values up until the point of failure) into a hidden database for that beatmap, and then include those scores when calculating the average score on that beatmap.
But this gives too much of an advantage to maps with high HP values. For example, I bet I'm not the only noob who has tried to play DoKito's Yomi Yori (HP = 6) several times only to fail horribly at the first stream. A large number of those plays would have gotten a much higher score if they were played on an identical beatmap with HP 4. Thus the HP 6 version would have had a lower average score if we add the failed scores.
So I think we need a more creative solution to the HP bar problem. EDIT 11/02/2018: The solution to this problem is really easy. (Thanks to Omnipotence -.) Just calculate the average score on a beatmap from the subset of the scores that would have passed if the beatmap had an HP value of 10. This way, the PP system treats all maps as if they had HP 10, so the actual HP value of a map won't affect the PP possible to gain from the map.
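A sketch of that HP-10 filter; would_pass_with_hp() here is a hypothetical helper that would re-simulate a play's hit results against a given HP drain:

```python
def hp10_average(plays, would_pass_with_hp):
    """Average ScoreV2 over only the plays that would survive HP 10.

    plays: (scorev2, hit_results) tuples for one beatmap.
    would_pass_with_hp(hit_results, hp): hypothetical HP-drain simulation.
    """
    passing = [score for score, hits in plays if would_pass_with_hp(hits, hp=10)]
    return sum(passing) / len(passing) if passing else None
```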
Another issue with this third idea is that some maps are popular almost exclusively with very new players, and other maps are popular almost exclusively with very good players. It will therefore be easier to do better than others on the former type of map than on the latter.
Let's call this problem "differential popularity" (better players tend to play harder maps, less good players tend to play easier maps) so that it's easier to talk about. This is actually also a problem in the first and second ideas, and it makes it hard to reliably measure the real difficulty of beatmaps. (Edit: I wrote a section on solving this problem later in this post.)
Having explained why I think this is the way we should define PP values, we now turn to finding a way to make this formula have the proper range of values.
Defining the range
I think the difference in PP value between the worst play and the current best play should be something like 1000 PP. A bit less or a bit more is fine. This is approximately the range of the current PP system. Maybe the new system should have slightly higher values, so the transition doesn't make many players angry about "losing" PP?
Anyway, here's the formula I think has the best scaling. I won't go into how I arrived at this equation, but if you need to know you can ask me.
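Roughly, with S, PPfloor, k_1 and k_2 as defined just below, a formula that reproduces the numbers quoted in this section is:

PP_play = k_2 * log_{k_1}(S / PPfloor) = k_2 * ln(S / PPfloor) / ln(k_1)   for S ≥ PPfloor, and 0 below the floor.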
Where we define the S variable ("Superiority") with the formula from the previous section.
In other words, S is the ratio of the ScoreV2 value of your play and the average ScoreV2 values of all the plays on the dedicated ScoreV2 leaderboard (see the first paragraph in the previous section).
PPfloor is the lowest value of S we want to award PP for; we give 0 PP to all scores below it. I think plays with an S value of less than 0.01 can safely be given 0 PP, because even the newest players can find maps on which they can achieve an S above 0.01 (and if I'm wrong, we can set PPfloor even lower). Consider: to get an S below 0.01, you would have to score (in ScoreV2) something like 5,000 on a map with an average score of 500,000. So PPfloor = 0.01 looks good to me.
k_1 and k_2 are constants that determine the range of PP values. If we set k_1 = 2.5 and k_2 = 100 then the range for different values of S looks like this.
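As a sketch in Python (using the log form above, so the exact numbers depend on that reconstruction):

```python
import math

K1, K2, PP_FLOOR = 2.5, 100, 0.01

def pp_play(s):
    """PP of a play with superiority s = your ScoreV2 / average ScoreV2 on the map."""
    if s < PP_FLOOR:
        return 0.0
    return K2 * math.log(s / PP_FLOOR, K1)

for s in (0.01, 0.1, 1, 10, 100, 1000):
    print(s, round(pp_play(s)))  # 0, 251, 503, 754, 1005, 1256
```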
To achieve 1256 PP, you need to score a thousand times better than average. For example by scoring 1,000,000 (nomod SS) on a map with average score of 1,000. You can fiddle with the constants in this Google spreadsheet to see what they do (anyone with link can edit).
I'm not sure what the best weighting is. I think the way to find out is to set the constants to something, use the formula to calculate a bunch of PP values from plays that have already been set, and see if the results look about right.
Also, I think it'd make sense to not give plays on a map PP before at least a thousand players have set scores on the map. Otherwise the PP values of the plays would be too dependent on random initial conditions.
Formula explained with an example
Imagine I set the number one score on xi - FREEDOM DiVE [FOUR DIMENSIONS] (normal score for me tbh): an HDHR SS worth 1,120,000 points on ScoreV2. Let's also say that the average ScoreV2 on that beatmap is 100,000 (I have no idea). That gives us an S of 11.2. Assume we use the same values for the constants as in the "Defining the range" section.
The amount of PP I would get would then be...
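With the log form sketched above, that works out to roughly 100 * log_2.5(11.2 / 0.01) ≈ 766 PP.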
Exploitability: PP farmers vs. PP hunter-gatherers
In any PP system that tries to directly define what "difficulty" is (e.g. by defining it as a function of aim, speed, accuracy and strain like PPv2 does), mappers will eventually be able to find out which kind of maps are easier to play while maximising PP value. This opens up a trend in which new PP records are being broken because mappers get better at making PP maps, rather than because players are actually getting better. Just to be clear, players are getting better, but at least some (read: a substantial amount) of the reason players have been gaining more PP is because new PP maps are being created.
(Having over 8k PP was a lot more impressive with the maps that were available 3 years ago.)
But I don't really think it's fair to blame the mappers. If blame should go anywhere, it should go to the PP system itself, because that's probably the easiest thing to change in order to fix the problem.
One such exploit I can foresee in PP!Balance is that maps that require memorisation will be easier to gain PP on than other maps. Why is this?
In osu!, there are some maps that can't be sightread well and require memorisation to score highly on (for example slider-heavy maps with varying slider velocity, because there is currently no way to read slider velocity before actually clicking the slider (this should be fixed imo!)). These maps will have a lot of bad scores on them, and many of those scores won't be bad because the players lacked skill, but because they didn't bother to memorise the map.
Also, on PP!Balance, a play's PP value depends on the ratio between your score (using ScoreV2) and the average score.
Now imagine you try to sightread a memorisation map as described above. At first, your score will of course be bad because you couldn't read it. You're in the same position as many other players who played the map: their scores on the map are bad because they didn't bother to memorise it, so the average score on the map will be low. But for you, this is an opportunity! You can decide to memorise the map and therefore do much better than those who didn't. Let's say your original score was 0.1 of average, and your score after memorising the map is 10 times average.
The point is that it's much easier to go from 0.1 to 10 times the average by memorising rather than by getting better aim, speed, accuracy, finger control or other skills. The only way to go from 0.1 to 10 on a pure jump map that everyone can sightread perfectly is to get better at aim, and that's a lot harder than simply memorising a map.
To be honest, I don't think this exploit is as bad as the exploitability of the current PP system. Giving a disproportionate amount of PP for memorisation doesn't sound like too much of a problem, when everyone has the same ability to exploit it. But even so, there is a reason why this exploit may be less exploitable than you think.
And that reason is that on PP!Balance, every exploit of the system will become less exploitable the more it is exploited. For example, on that slider-heavy memorisation map I linked to above, if it turns out that it is easy to exploit it for PP simply by memorising it, more players will be motivated to memorise it. The more PP a map gives, the more players will be motivated to memorise it, causing the average score to go up. And the higher up the average goes, the less PP will be granted for high scores on it.
This leads to a really cool dynamic of players being forced to look for new maps to exploit as soon as the PP from the first map has been depleted. Instead of PP farmers we get PP hunter-gatherers, living a nomadic lifestyle always on their feet to find new lucrative caches of PP. This becomes a general force to equalise the PP territory, such that any map that is disproportionately easy to gain PP from will become proportionately easy to gain PP from.
And that's cool.
Edit: Solving the differential popularity problem
My original post sort of just mentioned the existence of differential popularity and said "let's hope it works anyway despite this glaring problem here!" I'm suitably embarrassed by this now, and I should have made a bigger effort to solve it before I posted this. Anyway, here's an update with what I think may be a solution to the problem. But first, I describe the problem in clearer terms so that my suggested solution will be easier to understand.
PP!Balance is an attempt at constructing a statistical method for measuring difficulty and skill (btw, Full Tablet has a really interesting example of a statistical measure in this thread). This is in contrast with direct methods of measuring difficulty and skill (PPv2 is an example).
The essential problem in creating a statistical measure is this: you have two sets of things, players and beatmaps. You want to measure the skill of the players and the difficulty of the beatmaps. "Skill" is defined as a player's ability to score high on beatmaps, and "difficulty" is defined as a beatmap's ability to make players score poorly on it. If we had a reliable measure of the difficulty of all beatmaps, then using it to determine the players' skill levels would be easy; and if we had a measure of the skill level of all players, using it to determine the difficulty of beatmaps would be easy. But since we start out knowing neither, we have to find some trick to measure both at the same time.
One way of doing this would be to take a population of players with a constant average skill level, and then make them play all the beatmaps. The average scores that the population sets on each of the beatmaps would then be a pretty good measure of the difficulties of the beatmaps. If a population plays two different beatmaps, and the population's skill level remains constant, then a lower average score on one of the beatmaps means that that beatmap is harder (except for beatmaps where memorisation is important, as discussed earlier).
Unfortunately, no single population has played all the beatmaps. (Except maybe Toy and Blue Dragon.) So we're not guaranteed that the average scores on each of the beatmaps were set by the same average skill level. In fact, it's likely that the average scores on easier beatmaps were set by players with lower average skill level than the average scores on harder beatmaps. This is because players tend to seek out beatmaps with an appropriate difficulty level for them. We called this the "differential popularity problem".
So here's where my idea comes in.
Transitive Player Overlap Comparison
Definitions
M1, M2, M3 and so on are beatmaps.
P_M1 is the population of all players who have set a score on map M1.
P_M1 ∩ P_M2 is the population of all players who have set a score on both map M1 AND map M2.
Algorithm
We first pick a map, M1, and then call the population that has set scores on that map the "Alpha population" so that it's easy to talk about. We define the difficulty of that map as the average score the Alpha population has set on it.
To determine the difficulties of other beatmaps, we estimate what the Alpha population would have scored on them if they had played those maps. But how will we estimate what the Alpha population would have scored on a beatmap, Mx, that Alpha hasn't actually played?
First, we find a beatmap, Mx, such that at least 10,000 players (higher? lower?) have set scores on both Mx AND M1 within a timeframe of a week (higher? lower?) of each other. Actually, it doesn't have to be M1; it can be any map we already have an estimate for. Let's call that beatmap Ma, and the population that has played both Ma and Mx the "Beta population". In other words, we find a beatmap Mx for which |P_Ma ∩ P_Mx| ≥ 10,000. (I discuss some details on this population search in the next section.)
Once we have found the Beta population, we compare the average scores that the Alpha and Beta populations have set on Ma.
If AlphaAvg(Ma)/BetaAvg(Ma) is greater than 1, then we know that the Alpha population is better than the Beta population. That means that we need to adjust BetaAvg(Mx) upwards in order to estimate AlphaAvg(Mx). In other words, we get the following formula.
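AlphaAvg(Mx) ≈ BetaAvg(Mx) × AlphaAvg(Ma) / BetaAvg(Ma)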
Repeat until we have an estimated AlphaAvg for all beatmaps, and then we use those averages when we measure the PP values of plays as described under the "Defining the range" section.
Also note that it's not necessary to estimate the AlphaAvg more than once per beatmap, so this becomes a static measure of difficulty. Thus PP values will not fluctuate as people set new scores on each beatmap; they'll stay the same.
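Here's a rough sketch of the whole propagation in Python; players_of, overlap and avg_score stand in for whatever database queries would actually back this, so treat the names and thresholds as placeholders:

```python
from collections import deque

def estimate_alpha_averages(maps, m1, players_of, overlap, avg_score, min_overlap=10_000):
    """Propagate difficulty estimates outward from the anchor map m1.

    players_of(m): players who have set a score on map m.
    overlap(ma, mx): players who scored on both ma and mx (within the chosen timeframe).
    avg_score(players, m): average ScoreV2 of those players on map m.
    Returns {map_id: estimated average score of the Alpha population (the players of m1)}.
    """
    alpha_avg = {m1: avg_score(players_of(m1), m1)}
    queue = deque([m1])
    while queue:
        ma = queue.popleft()
        for mx in maps:
            if mx in alpha_avg:
                continue
            beta = overlap(ma, mx)  # the "Beta population" for this pair of maps
            if len(beta) < min_overlap:
                continue
            # how much better Alpha is than Beta, judged on the map we already have an estimate for
            ratio = alpha_avg[ma] / avg_score(beta, ma)
            alpha_avg[mx] = avg_score(beta, mx) * ratio
            queue.append(mx)
    return alpha_avg
```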
Details on the population search
- The limited timeframe is there to ensure that the population's average skill level is the same when they're playing the first beatmap as when they're playing the second beatmap. For example, say we want to measure the relative difficulty of a 3* beatmap and a 6* beatmap. If we didn't specify a timeframe, it could be that the same population played the 3* beatmap an average of 1 year earlier than they played the 6* beatmap. We can expect players to get better over time, so their average skill would be higher when they played the 6* beatmap than when they played the 3* beatmap, and our attempt to measure the relative difficulty would fail.
- We can also require that all the players in the population have played both beatmaps at least 10 times each (or more). This will lessen score differences due to the memorisation problem (discussed earlier), because the players will have had time to memorise both maps. Remaining score differences will then be due to how much the maps demand of other skills (aim, speed, sliders, reading, etc.), not memorisation.
- This system will work badly for maps that are popular with very new players, because even if we limit the population search to players who have played both beatmaps within a day of each other, very new players can improve substantially within a day, so the population's average skill level may differ between the two maps. To prevent this, we can limit the population search to players with a playtime of at least 100 hours or so; players who have played a lot improve less per hour than very new players.
- Instead of searching for the first beatmap that meets the criteria listed above, the population search could instead pick the beatmap with the highest combined measure of 1) largest common population between the two beatmaps, 2) shortest timeframe between players playing the two beatmaps, 3) highest player playtime, and 4) highest number of times each player has played both beatmaps (see the sketch below).
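If we go with that combined measure, a toy scoring function for candidate beatmaps might look like this (every weight and normalisation here is a placeholder I made up, not a tuned value):

```python
def candidate_score(common_players, days_between_plays, avg_playtime_hours, avg_plays_per_map):
    """Toy ranking for the population search: bigger overlap, shorter gap between plays,
    more experienced players and more repeat plays all push the score up."""
    return (common_players / 10_000
            - days_between_plays / 7
            + avg_playtime_hours / 100
            + avg_plays_per_map / 10)
```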
Testing period before full transition
I mentioned in the introduction that I'm not sure exactly how this system will pan out if implemented and I don't have a way to check. If by magic coincidence, Cookiezi's FC on Image Material (honourable mention, La Valse also FCd it!) is worth less than any of my own scores ever, something has gone horribly wrong with the system and we need to figure out what before we actually implement it.
But even after the system has been tweaked to look about right for the scores currently in the database, I still think it's wise to have a public "testing period" to see how players react and whether they can find bugs that no one thought of during development.
One way of doing this could be to show the PP!Balance value of plays and players on their profiles. This could be under the PPv2 values, inside parentheses and with a note that links to a post where they can get information and give feedback.
If players dislike the system more than they like it, well, that's a pretty good thing to know before it's implemented properly. No reason to ship updates that make people net unhappy.
It's worth mentioning that people already seem to be pretty miffed with PPv2. The question is whether they'll be more miffed with this system. If you think about it... Most of the playerbase are not rank 1, so they are dissatisfied with their rank. Therefore, most of the playerbase will be happy about a change in the PP system, see it as an opportunity and think "maybe now I can become rank 1 faster?" Maybe?
Questions and Answers
But Mio, this will completely ruin the current PP farming mapping meta!
This is a FEATURE. Not a bug. Definitely not a bug.
What about FL, HT and EZ? Won't they be overvalued?
My first response is you can't even EZHTFL and you call yourself an osu!player? My second response is, actually, these mods may be overvalued on this system if they are overvalued on ScoreV2. I'm not an expert on any of these mods, so I don't have a good feel for how difficult they are compared to nomod.
If all the top PP plays in this system turn out to be EZ, HT and/or FL scores, that probably means the system is imbalanced. Most plays are not EZ, HT or FL, so it would be suspicious if the most difficult plays all turned out to be set with one of these mods.
Maps like Yomi Yori are a potential PP treasure trove for EZ players. The average ScoreV2 on that map will be pretty low since it's such a difficult map, so getting a good score on it will be worth a lot of PP. But getting a good score on it will be easier with EZ or HT, so is their PP value balanced?
I don't know. Is idke's HT FC more impressive than firebat92's nomod score (currently one rank below)? I suspect more good players have played that map nomod than with HT, yet we find more HT scores than nomod scores in the top 50. This is probably because HT makes it easier to set high scores than nomod does. The same reasoning applies to EZ. I'm assuming the leaderboards look similar on ScoreV2. Time Freeze is another good example.
(Also, yes, I do realise that EZ and HT make maps much harder to read, so they are really impressive in their own way. But I stress that I'm really not an expert on any of this, so if I say something that is inexcusably wrong, please correct me.)
I don't know if FL is balanced either. rrtyui's FL score on Neuronecia is worth approximately as much as other HDHR scores on it. Should the FL score be worth more or less? This all warrants more testing than I can do.
A possible solution to this is to calculate the PP for EZ, HT and/or FL scores against an average score that only counts scores set with the same mods (a rough sketch of this follows below). But because most maps will not have at least a thousand players who have played them with these mods, EZ/HT/FL players would not be able to gain PP from most maps. This seems bad, so I wouldn't advocate it as a solution unless testing indicates that these mods really are overvalued.
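If that solution were ever tested, the per-mod average could look something like this minimal sketch. Again, the data layout and the thousand-player threshold are placeholder assumptions on my part, not anything that exists in osu!:

```python
MIN_MOD_POPULATION = 1000   # placeholder threshold from the paragraph above

def average_score(scores, mods=None):
    """Average ScoreV2 on one beatmap, optionally restricted to one mod combination.

    `scores` is a hypothetical list of (mods, score_v2) tuples for that beatmap,
    where `mods` is a frozenset like frozenset({"EZ"}) or frozenset({"HT", "HD"}).
    Returns None when too few players have used that exact mod combination."""
    if mods is not None:
        pool = [s for m, s in scores if m == mods]
        if len(pool) < MIN_MOD_POPULATION:
            return None     # not enough EZ/HT/FL players on this map
    else:
        pool = [s for _, s in scores]
    return sum(pool) / len(pool) if pool else None
```

Which also shows the downside mentioned above: with a threshold like that, most maps would simply return no average at all for EZ/HT/FL, so no PP could be awarded on them for those mods.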
(While we're on the topic of balancing mods: The score modifier for HD ought obviously to scale inversely with AR.)
You need to balance the amount of balance in your PP!Balance, otherwise we'll have too much balance!
I can imagine the complaint,
Hypothetical Player: "Look, I actually just want the PP system to balance the skills I like. No silly stuff! Can you even imagine what crazy people like Ekoro, Riviclia, Mafham and their gang will become good at next?"
With the exception of the EZ, FL and HT mods (discussed in the previous section), I don't think this will be a problem. The beauty of this system is that when mappers create new, creative maps that are difficult in some hitherto unknown way, players can try to gain PP by becoming better at those kinds of maps.
"Yeah, he's just good at s-such weird maps."
(OWC commentator on -GN carrying his team on Graces of Heaven)
This system fails to balance accuracy
The argument goes: in (the current version of) ScoreV2, 30 % of your score is determined by accuracy. Some players may think this is too little, other players may think this is too much. Since PP!Balance uses ScoreV2 for its calculations, the complaints about accuracy in ScoreV2 also apply to PP!Balance.
I think this is true, but only partially. Players who are best at accuracy will still be able to gain a lot of PP by playing maps where the average score is low because they are especially hard to acc. If your strength is accuracy, you'll have an easier time doing better than others on these maps, so those maps (rather than, say, aim maps) will be your "farm maps". (I'm sorry, I'm not sure which maps those actually are, so I can't give any examples.) A toy illustration follows below.
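To illustrate with completely made-up numbers, and assuming for the sake of the example that superiority is measured as the gap between your ScoreV2 and the map's average ScoreV2 (just one of the possible measures, not necessarily the one that would end up being used):

```python
# Made-up averages: the acc-heavy map has a lower average ScoreV2
# because most players drop a lot of accuracy on it.
avg_score_aim_map = 900_000   # easy to acc, hard to aim
avg_score_acc_map = 700_000   # easy to aim, hard to acc

# An accuracy specialist who barely drops accuracy anywhere:
player_score_aim_map = 950_000
player_score_acc_map = 920_000

# Gap to the average, as a stand-in for superiority:
print(player_score_aim_map - avg_score_aim_map)  # 50000
print(player_score_acc_map - avg_score_acc_map)  # 220000
```

Even though the raw score on the acc-heavy map is lower, the gap to the rest of the population is much bigger, so that's where the accuracy specialist's PP comes from.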
ScoreV2 isn't sensitive to the length of beatmaps, so this system overvalues short maps
Not really. It is true that short maps give more room for luck to determine the outcome of a score, but this is true for all players equally. So it doesn't make it easier to be better than others at the map, which is what determines the PP value.
How do you upload math equations as images quickly?
Write the maths in LaTeX using this website, download it as a png, and then drag that png into the ShareX window (I can't use puush because they aren't accepting new users at the moment) so that it uploads the image and gives you a URL.
(Why forum not already have LaTeX support?!)