I made some changes to the algorithm, inspired by this post: p/4383854
The algorithm fits the data (scores obtained by players) to logistic curves, where the parameters to fit are Player Skill, Beatmap Difficulty for 900K score, and Steepness of the difficulty curve for beatmaps.
The predicted score for a play is: [formula image not preserved].
Where P is the player skill, B is the beatmap difficulty (for 900K score), and S is the steepness parameter of the difficulty curve.
For example, 2 different maps that have the same difficulty at 900K, but different steepness:
The orange curve represents the difficulty curve of a map with high steepness, while the blue one has lower steepness.
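A minimal sketch of such a curve, assuming a 1,000,000 score cap and a logistic form anchored so that a player with P = B scores exactly 900K. The exact formula used by the algorithm may differ; `predicted_score` and the sample values below are hypothetical and only meant to show how steepness changes the shape.

```python
import math

MAX_SCORE = 1_000_000  # assumed score cap


def predicted_score(p, b, s):
    """Assumed logistic form (not necessarily the post's exact formula).

    p: player skill, b: beatmap difficulty for 900K, s: steepness.
    The 1/9 factor anchors the curve at 900K when p == b.
    """
    return MAX_SCORE / (1.0 + math.exp(-s * (p - b)) / 9.0)


# Two hypothetical maps with the same 900K difficulty but different steepness.
for skill in (4.0, 5.0, 6.0):
    low = predicted_score(skill, 5.0, 0.5)   # gentle curve
    high = predicted_score(skill, 5.0, 2.0)  # steep curve
    print(f"skill={skill}: low steepness {low:.0f}, high steepness {high:.0f}")
```

With this form, both maps give 900K to a player whose skill equals the difficulty, but the steeper map punishes weaker players more and rewards stronger players more.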
The regression minimizes the sum of the squared errors between the predicted scores and the actual scores in the data.
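The least-squares idea can be sketched as follows. This is a simplified illustration, not the actual implementation: it assumes the same hypothetical logistic form (900K when P = B, 1,000,000 cap), keeps player skills fixed, and fits only one beatmap's difficulty and steepness with a coarse grid search standing in for a real optimizer. The real algorithm fits players and beatmaps together. All names and values here are made up.

```python
import math

MAX_SCORE = 1_000_000  # assumed score cap


def predicted_score(p, b, s):
    # Assumed logistic form (not the post's exact formula): 900K when p == b.
    return MAX_SCORE / (1.0 + math.exp(-s * (p - b)) / 9.0)


def sum_squared_error(plays, skills, b, s):
    """Sum of squared errors for one beatmap.

    plays: list of (player_name, actual_score); skills: name -> skill.
    """
    return sum((predicted_score(skills[name], b, s) - score) ** 2
               for name, score in plays)


def fit_beatmap(plays, skills):
    """Coarse grid search over (b, s) minimizing the squared error."""
    best = None
    for bi in range(0, 101):        # b in 0.0 .. 10.0
        for si in range(1, 51):     # s in 0.1 .. 5.0
            b, s = bi / 10, si / 10
            err = sum_squared_error(plays, skills, b, s)
            if best is None or err < best[0]:
                best = (err, b, s)
    return best[1], best[2]


# Synthetic, noiseless scores generated from b=5.0, s=1.5.
skills = {"alice": 4.0, "bob": 5.5, "carol": 7.0}
plays = [(name, predicted_score(p, 5.0, 1.5)) for name, p in skills.items()]
b_fit, s_fit = fit_beatmap(plays, skills)
print(b_fit, s_fit)
```

On this noiseless synthetic data the search recovers the difficulty and steepness used to generate the scores. With a single score on a map, many (b, s) pairs fit equally well, which is why a default steepness is needed in that case.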
Here are results for ranked 6K maps: https://www.dropbox.com/s/vyoi1r86m9r8t ... .xlsx?dl=0
Take difficulty results for beatmaps with few scores with a grain of salt, especially those with only 1 score to base the calculation on: those use a default steepness parameter instead of a calculated one.
For the player rankings, there is also a "Performance" value. This value is calculated from the difficulty associated with each of the player's plays, with a score penalty based on map length (since fluke plays are more likely on shorter maps) and reduced weighting for beatmaps whose difficulty was estimated from few scores (since those estimates are more likely to be inaccurate). The "Player Skill" is the value used in the beatmap difficulty estimation, and is more indicative of the player's average performance across their plays.
To run the algorithm for other keycounts, I would need to select players to base the calculations on (I can't use a very large number, since the algorithm is expensive in RAM and CPU use). Ideally, the players should have a large number of plays and a consistent performance, that is, not many scores below their current level of play (for example, a player who has improved a lot over time but hasn't improved their old scores). The players should also represent a wide range of skill levels. Once the beatmap difficulty values are calculated, adding more players to the ranking is relatively simple (but score retrieval using the osu! API is still quite slow).