PM me if you want someone to be included in future updates.
Based on scores on ranked beatmaps, these results simultaneously estimate beatmap difficulty (how hard it is to achieve a certain score) and player skill (the ability to score high on beatmaps).
A beatmap is rated high when players who are rated high in skill have low scores on it, whereas a player is rated high when they have high scores on beatmaps that are rated high. The method is completely statistical and doesn't look at the content of beatmaps (except for the number of objects, which is used for the expected amount of variance).
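To illustrate the circular definition above (difficulty depends on skill, and skill depends on difficulty), here is a toy sketch in Python. It is not the method used for these results: it assumes a simple additive model where each play has a numeric `result` on the same scale as skill and difficulty, with `result = skill - difficulty`, and it alternates between re-estimating skills and difficulties. The function name `fit_skill_and_difficulty`, the `result` scale, and the zero-mean anchoring are all assumptions made for the example.

```python
from collections import defaultdict


def fit_skill_and_difficulty(plays, iterations=100):
    """Toy alternating estimate, assuming result = skill - difficulty.

    plays: list of (player, beatmap, result) tuples, where a higher
    result means a better play (hypothetical scale, not raw score).
    """
    skill = defaultdict(float)
    difficulty = defaultdict(float)

    for _ in range(iterations):
        # Players: a high result on a hard map implies high skill.
        sums, counts = defaultdict(float), defaultdict(int)
        for player, beatmap, result in plays:
            sums[player] += result + difficulty[beatmap]
            counts[player] += 1
        for player in sums:
            skill[player] = sums[player] / counts[player]

        # Beatmaps: a low result by a skilled player implies a hard map.
        sums, counts = defaultdict(float), defaultdict(int)
        for player, beatmap, result in plays:
            sums[beatmap] += skill[player] - result
            counts[beatmap] += 1
        for beatmap in sums:
            difficulty[beatmap] = sums[beatmap] / counts[beatmap]

        # Anchor the scale: only differences matter in this model, so
        # shift both sides to keep the mean difficulty at zero.
        shift = sum(difficulty.values()) / len(difficulty)
        for beatmap in difficulty:
            difficulty[beatmap] -= shift
        for player in skill:
            skill[player] -= shift

    return dict(skill), dict(difficulty)


# Made-up plays: A does better than B on both maps, and both do worse on "hard".
plays = [
    ("A", "easy", 2.5), ("A", "hard", 1.5),
    ("B", "easy", 1.5), ("B", "hard", 0.5),
]
skill, difficulty = fit_skill_and_difficulty(plays)
```

In this made-up data, player A ends up with a higher skill than B, and "hard" with a higher difficulty than "easy", even though the code never looks at what is inside the maps.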
Terms:
Tom Stars: Star Rating of a beatmap (based on the algorithm mainly designed by Tom94)
"X" Diff: The estimated difficulty of achieving a score of at least X*1000 on a map with 1000 retries. These are measured on a scale that resembles star rating, with the 900K Difficulty giving the values closest to Tom Stars.
Score Count: Number of scores retrieved for the beatmap or player from the online leaderboards.
Average Play Skill: The average difficulty of all the scores on record for the player. Not a very meaningful measure, since it may include scores set when the player was at a lower skill level.
Peak Play Skill: Similar to Average Play Skill, but the best scores are weighted much higher than the rest. It's very sensitive to outliers, so it is not a very robust indicator of skill.
Accuracy Performance: Indicator of skill that works similarly to pp: setting a sub-par score doesn't lower the value, and the best score has a weight of 100%, the 2nd one a weight of 95%, etc. The scale is the same as the one for XK Difficulty, so having a lot of scores of a certain difficulty makes the Accuracy Performance converge to that difficulty value.
Technical Performance: Similar to Accuracy Performance, but it doesn't reward you for scoring past 900,000 (for example, a score of 960,000 awards the same performance as a score of 900,000). This estimates the player's ability to set good scores on difficult maps, rather than rewarding players for very good accuracy on easier maps.
xK ppv2: The pp achieved from all ranked xK maps the player has played, not counting the bonus pp from setting a lot of scores.
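As a rough illustration of the Accuracy Performance and Technical Performance terms above, here is a small Python sketch. Several things in it are assumptions and not taken from the actual calculation: the weights are assumed to decay geometrically (100%, 95%, 90.25%, ...) as in pp; the weighted sum is scaled by (1 - 0.95) so that many plays of the same difficulty converge to that difficulty; and a play's value is assumed to be the difficulty of the highest score threshold it reached. The names `play_value` and `weighted_performance` are made up for the example.

```python
def play_value(score, threshold_difficulty, cap=None):
    """Difficulty credited to one play (hypothetical rule).

    threshold_difficulty maps X (in thousands of score) to the map's
    "X" Diff, e.g. {900: 3.6, 950: 4.0}. If cap is given, the score is
    clamped first, so anything above the cap earns nothing extra.
    """
    if cap is not None:
        score = min(score, cap)
    reached = [
        diff for x, diff in threshold_difficulty.items() if score >= x * 1000
    ]
    return max(reached, default=0.0)


def weighted_performance(values, decay=0.95):
    """pp-style aggregate: best play at 100%, next at 95%, and so on.

    The (1 - decay) factor is an assumed normalisation that keeps the
    result on the same scale as the individual play values.
    """
    ordered = sorted(values, reverse=True)
    total = sum(value * decay ** rank for rank, value in enumerate(ordered))
    return (1 - decay) * total


# Made-up map with two known thresholds.
example_map = {900: 3.6, 950: 4.0}

accuracy_value = play_value(960_000, example_map)                # uses the 950K Diff
technical_value = play_value(960_000, example_map, cap=900_000)  # same as a 900,000 score

many_plays = weighted_performance([4.0] * 1000)  # approaches 4.0
one_play = weighted_performance([4.0])           # far below 4.0
```

Because plays are sorted before weighting and all values are non-negative, adding a weaker play can only add a small term at the end of the sum, which is why a sub-par score never lowers the total.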
Newest Version (2020/02/02), 7K Only
https://drive.google.com/file/d/1vmWpPannfXiR3xTYoypbplV8xsciNPtB/view?usp=sharing
Old Version (2019/02/04), 4K Only
https://docs.google.com/spreadsheets/d/1njYWZSQjV6D8EHrCnpnzRbQycH0BG7C-DWy2--T8Zjw/edit?usp=sharing
Old Version (2018/02/16), 7K/9K
https://docs.google.com/spreadsheets/d/16ik3TElUYhzTkm6U6QdA_J0owiQJJ_Wx1yjmYNCJ9jk/edit?usp=sharing
Different keymodes have different scaling; they aren't meant to be directly compared with one another.
What are your opinions of the results?
This is not meant to replace the current beatmap difficulty algorithm used for pp, since it has the limitations of purely statistical approaches. It might be used to calibrate difficulty algorithms that are based on beatmap analysis, though.
Edit: 2020/02/02: Updated results for 7K.