Statistic approach to Player Skill and Beatmap Difficulty

Topic Starter
Full Tablet
Now that each player can have several scores per map stored on the osu! servers, I will delay the next update so players have time to set more scores. I expect several players will start setting DT scores on maps, which should make the ratings of the DT versions of maps more accurate overall.
abraker
I wonder why you haven't considered Ripple scores yet
Topic Starter
Full Tablet
Here are the updated results for 4K maps and players:

https://docs.google.com/spreadsheets/d/ ... sp=sharing

The next update will cover 7K maps and players.
snoverpk
nice update but all of the scores are from february
Topic Starter
Full Tablet

snoverpk wrote:

nice update but all of the scores are from february
Retrieving the scores from the osu! servers through the API took a bit more than a month, and the calculation afterwards took several months. Future updates should take less time, thanks to several optimizations made to the algorithm in the meantime.
Minisora
I'm too horrible at mania to be included in the list :)

Nice list though, I give an A+ for the computer making the calculations :P
Topic Starter
Full Tablet
https://docs.google.com/spreadsheets/d/ ... sp=sharing

Here are results for 9K maps and players. Results for 7K were delayed because of complications while retrieving the score data with the API (there was a bug in Mathematica 11.1 that made some API calls return incorrect data).
Topic Starter
Full Tablet
Added results for 7K players and maps.

https://docs.google.com/spreadsheets/d/ ... sp=sharing
abraker
I have been wondering, is there any correlation between the length of the map and the number of people who get a higher score when comparing maps of similar SR (tom stars)?
Topic Starter
Full Tablet

abraker wrote:

I have been wondering, is there any correlation between the length of the map and the number of people who get a higher score when comparing maps of similar SR (tom stars)?


Here are some graphs of the number of notes in each beatmap against the ratio of plays that pass certain score milestones (800k, 900k, 990k, 1M), based on the scores in the data, for several star rating ranges.

The ratio of plays that pass a milestone tends to decrease as the number of notes increases, but the correlation is not strong.

The correlation coefficients of the linear regressions are rather low, with r of around 0.35 for the [0.8, 1.2] star rating range with the 990k and 1M milestones, and for the [1.8, 2.2] star rating range with the 990k milestone.

For other score milestones and other star rating ranges, the correlation is even lower.
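For anyone who wants to repeat this kind of check on their own score data, here is a minimal sketch of the computation described above: the pass ratio for one milestone per map, then the Pearson correlation against note count. It is not the code used for the graphs; the function names and the toy numbers are made up for illustration. For scale, an r of 0.35 means r² is about 0.12, so a linear fit on note count explains only about 12% of the variance in pass ratio.

```python
import math

def pass_ratio(scores, milestone):
    """Fraction of the scores on one map that reach the milestone."""
    return sum(s >= milestone for s in scores) / len(scores)

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up toy data, NOT taken from the spreadsheets:
# (note count of the map, scores set on the map)
toy_maps = [
    (400, [995_000, 960_000, 991_000, 1_000_000]),
    (800, [992_000, 940_000, 985_000, 970_000]),
    (1200, [990_500, 930_000, 975_000, 999_000]),
    (1600, [950_000, 920_000, 980_000, 960_000]),
]

notes = [note_count for note_count, _ in toy_maps]
ratios = [pass_ratio(scores, 990_000) for _, scores in toy_maps]
r = pearson_r(notes, ratios)
print(f"r = {r:.2f}, r^2 = {r * r:.2f}")
```

The toy data is deliberately tidy, so it gives a much stronger (negative) correlation than the real data does. With real scores, the maps would first be grouped by star rating range, as in the graphs.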
Topic Starter
Full Tablet
https://drive.google.com/file/d/1KFFVOM_YsnRuvfSUWr4M2_KFZtoyyghH/view?usp=sharing

New update for 4K and 7K beatmaps and players, using newer scores and fixing a typo in the algorithm that made the results slightly different from intended.
abraker
Blastix Riotz +DT seems a little weird for 500k and 600k, double difficulty. I am going to guess not enough data points.

This makes me think, since you have been working with the data for some time now, how many data points do you typically need for the results to be accurate or at least make sense?
Topic Starter
Full Tablet

abraker wrote:

Blastix Riotz +DT seems a little weird for 500k and 600k, double difficulty. I am going to guess not enough data points.

The blue dots are the scores in the data (score achieved vs. average play skill of the player).

The data for Blastix Riotz +DT is lacking. From what we can see in the data, it seems only the best players can achieve more than 500k score on the beatmap, but there aren't many scores set by the best players, so we can't be certain that the difficulty estimation is accurate. The difficulty rating for a score of 700k or better is heavily extrapolated, so it may not be accurate at all. The best player who has set a score on the beatmap is [Crz]Player (who is rated as the player who sets good scores on the most difficult maps, though not as the player who most consistently sets the best scores), and even he was still far from getting a 600k score or better.

The scale of difficulty is set so that the skill of the players in the data follows a gamma distribution with mean 3 and standard deviation 1.5. So a value of 3 is something the "average" player in the data is expected to be able to do (which is still quite an achievement, since the data is mostly composed of the best players in the game), a value of 5 is roughly something only the top 10% is expected to be able to do, while 10.71 is something that goes a bit beyond what any player could do consistently.
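To make those numbers concrete, here is a small sketch that converts the stated mean and standard deviation into gamma parameters and evaluates the share of players above a given value. It only illustrates the scale and is not part of the rating algorithm; the function names are made up. A mean of 3 and a standard deviation of 1.5 give shape (3 / 1.5)² = 4 and scale 1.5² / 3 = 0.75, and since the shape is an integer, the tail probability has a simple closed form.

```python
import math

def gamma_params(mean, sd):
    """Shape and scale of the gamma distribution with this mean and sd."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return shape, scale

def gamma_tail(x, shape, scale):
    """P(X > x) for a gamma distribution with integer shape (Erlang form)."""
    k = round(shape)
    if abs(shape - k) > 1e-9:
        raise ValueError("this closed form only holds for an integer shape")
    z = x / scale
    return math.exp(-z) * sum(z ** i / math.factorial(i) for i in range(k))

shape, scale = gamma_params(3.0, 1.5)            # shape 4, scale 0.75
share_above_5 = gamma_tail(5.0, shape, scale)
share_above_10_71 = gamma_tail(10.71, shape, scale)

print(f"shape = {shape}, scale = {scale}")
print(f"share of players above 5:     {share_above_5:.3f}")
print(f"share of players above 10.71: {share_above_10_71:.5f}")
```

The share above 5 comes out at about 0.10, matching the "top 10%" figure, and the share above 10.71 is roughly 0.0004, less than one player in two thousand.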

The scale itself doesn't actually matter for the calculations. If you can define what "double the amount of difficulty" means, maybe I could set the scale according to that definition.

abraker wrote:

This makes me think, since you have been working with the data for some time now, how many data points do you typically need for the results to be accurate or at least make sense?
The more scores there are from players who struggle to reach a certain goal, the more confident we can be that the difficulty estimation for that goal is accurate.

Usually, about 20 scores in the same score range is the bare minimum to be confident about the estimation, but having more than 200 or even 1000 plays is much better. Some popular maps have 1000+ scores in total in the data, but still have few scores in some score ranges (for example, despite AiAe [MX] being the most popular map, few players in the data have less than 700k score on it).
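As a rough way to spot which parts of a map's difficulty curve are thinly supported, the scores can be counted per score range and compared against that minimum of 20. This is only a sketch: the 100k-wide ranges and the function name are assumptions for illustration, not how the ranges are defined in the actual calculation.

```python
from collections import Counter

MIN_SCORES = 20       # the "bare minimum" mentioned above
BIN_WIDTH = 100_000   # assumed width of a score range (hypothetical)
MAX_SCORE = 1_000_000

def sparse_ranges(scores, min_scores=MIN_SCORES, bin_width=BIN_WIDTH):
    """Lower bounds of the score ranges that hold fewer than min_scores plays."""
    n_bins = MAX_SCORE // bin_width
    # A perfect 1,000,000 is counted in the top range.
    counts = Counter(min(s // bin_width, n_bins - 1) for s in scores)
    return [b * bin_width for b in range(n_bins) if counts[b] < min_scores]

# Made-up toy scores for one map: many high scores, very few low ones.
toy_scores = [950_000] * 30 + [850_000] * 25 + [650_000] * 5
sparse = sparse_ranges(toy_scores)
print(sparse)
```

On this toy map only the 800k and 900k ranges have enough plays, so every range below 800k is reported as sparse. That is the same situation as a popular map where almost nobody in the data scores under 700k.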
Topic Starter
Full Tablet
New update for 4K with scores retrieved mostly during November 2018. This one took the scores from 4000 players, so it took considerably longer to retrieve the scores and calculate the results.

https://docs.google.com/spreadsheets/d/1njYWZSQjV6D8EHrCnpnzRbQycH0BG7C-DWy2--T8Zjw/edit?usp=sharing
