Hello! Please note that the following is for just the osu! game mode.
osu!taiko is currently running a similar trial, which may have its own post in the future, but discussion should be kept around just osu!standard for this thread.
Overview
Previously, the NAT handled BN applications and evaluations themselves. They would hold the discussion and be the final decision makers. BNs could submit mock evaluations on applications to give the NAT a better idea of who would be good NAT candidates in the future, but their votes did not influence final decisions.
When the NAT did evaluations solely on their own, each evaluation was randomly assigned 3 NAT on the BN website. NAT would initially do these individually to avoid confirmation bias. When an evaluation card reached 3 NAT evaluations, it moved to the group discussion stage, where NAT would discuss their decisions and write feedback. A Discord bot would notify them of assignments and upcoming due dates. NAT could choose to add a random amount of BN evaluators, but the BN evaluations were not required for cards to move to the group stage. BNs were also unable to see evaluation cards in the group stage.
How this changed during the trial
In mid May we started a trial of a slightly different system. In this trial the osu! standard NAT were joined by Beatmap Nominators in evaluating current BNs and BN applications. This gave BNs full access to evaluating applications and doing current BN evaluations alongside the NAT. They got to be more involved in the decision making process, participate in the discussion that followed in NAT channels, and write the feedback sent to the users evaluated, all alongside the NAT members. Recent BN applicants may have noticed this from seeing purple names on their BN application feedback.
Each evaluation was assigned 2 BNs and 2 NAT, and required 4 evaluations to move to the group stage.
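For anyone who finds it easier to read as code, here is a rough sketch of the assignment and group stage rule described above. It is not how the BN website actually implements it; the function names, parameters, and placeholder usernames are all made up for illustration.

```python
import random

def assign_evaluators(nat_pool, bn_pool, nat_count=2, bn_count=2, rng=random):
    """Randomly pick evaluators for one evaluation card (hypothetical helper).

    Trial setup: 2 NAT + 2 BNs. The pre-trial setup would correspond to
    nat_count=3, bn_count=0.
    """
    return rng.sample(nat_pool, nat_count) + rng.sample(bn_pool, bn_count)

def ready_for_group_stage(submitted_evaluations, required=4):
    """A card moves to group discussion once enough individual evaluations are in.

    required=4 matches the trial; required=3 matches the pre-trial setup.
    """
    return len(submitted_evaluations) >= required

# Placeholder names, not real users.
nat_pool = ["nat_a", "nat_b", "nat_c", "nat_d"]
bn_pool = ["bn_a", "bn_b", "bn_c", "bn_d", "bn_e"]

assigned = assign_evaluators(nat_pool, bn_pool)
print(assigned)
print(ready_for_group_stage(["eval_1", "eval_2", "eval_3"]))  # False during the trial
```

Evaluations are still written individually before the group stage, which is why the sketch only counts submitted evaluations and does not look at their content.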
Who participated
Trial participants were selected from BNs who volunteered their interest.
The first wave of the trial was a smaller group consisting of Uberzolik, Mafumafu, Andrea, Nana Abe, Riana, Cheri, Sparhten, Firika, fieryrage and Bibbity Bill (10 users). It ran from mid May to mid June.
The second wave of the trial had a bigger pool of users and ran over a longer span of time, from mid June to mid August. This group consisted of VINXIS, StarCastler, Cris-, Petal, UberFazz, realy0_, AJT, rosario wknd, -Keitaro, Elayue, Kudosu, NexusQI, pimp, NeKroMan4ik, Mirash, Mimari, Stixy and Morrighan (18 users).
Trial Feedback
After each wave ended, we surveyed the BNs for their thoughts on the trial. At the end of the trial we also surveyed the NAT members. The purpose of the trial was to explore where we can take the main osu! game mode's BN system in the future, which this thread is meant to help discuss further.
Below is a summary of the survey response takeaways.
BN Survey Summary
Would you continue doing evaluations if you were given the option?
- Responses were overwhelmingly "yes", with only a handful saying no or not sure.
- Difficult to communicate, especially in cases where opinions were split and they had to come to a conclusion. In larger groups this decision making lacked a clear process.
- Fun and gives new insight into the process, is a cool way to contribute.
- Feels better than previous mock BN evaluations, opinions mattering made it more interesting and motivating.
- It is hard to come to final decisions, and would need NAT guidance.
- Weigh the results so it is not solely BN decisions deciding who is accepted, rejected, passing current evals, etc.
- BNs already have enough to do with main BN work, and evaluations would only be a distraction from their main priority. This would lead to either their BN activity or evaluation participation dropping drastically.
- BN self management may not work, but it can be a good replacement for previous mock evals for finding new NAT members.
- This depends on where the system goes
- NO, have set restrictions such as a good BN record for x length of time.
- Remove those who perform poorly from doing BN evaluations in the future.
NAT Survey Summary
Most points from the BN summary were in the NAT responses as well, so this summary will focus on points specific to the NAT survey.
What are your thoughts on the trial and how was your experience with it?
- Opinions were fairly split between being satisfied that it went better than expected, feeling that it is not realistically any different, and worrying that it will bring more issues to deal with as standards shift.
- Cycle frequently
- Good for BNs to showcase other skills related to NAT work.
- Fewer users at a time, such as 5-7
- Not all BNs, require them to be BN for ~6 months.
- Have a good record without warnings for behavior, quality issues, etc.
- No recent issues in their evaluations.
- Otherwise, give those interested a shot. Don't actively look for participants.
Document on additional Trial NAT Observations as of wave 1 written by yaspo
Observed problems
We saw several issues throughout the trial cycle, such as:
- BNs using their new abilities to leak evaluation and application results before they were ready or finished.
- One BN participating in the trial was placed on probation midway through and had to be removed from the trial because of this.
- How to structure this so that BNs do not see their own current BN evaluations in real time. This visibility can lead to panic if someone knows for sure their eval is going poorly, and several BNs mentioned general awkwardness from it throughout the trial.
- How to transition between waves. Wave 2 participants had difficulty finishing leftover wave 1 work, as they had not individually evaluated those cards themselves.
- In both waves we saw carry users who did the majority of the work while some users faded out of active participation.
- In wave 1 this was partly due to all evals being done so quickly that participants with less free time simply did not get the chance to do any.
- In wave 2, we implemented a system where unassigned users could not do evaluations until they were close to their due date. This was to give assigned users a fair chance to do their given evaluations.
- However, the wave 2 change instead led to many evaluations becoming overdue, as there was no active system to track when an evaluation would open up to those who weren't assigned to it.
- Fluctuating standards between evaluators. The NAT, as a smaller group, work closer together and have overall similar standards with different insights, but among the larger group of BNs, standards for evaluations varied much more widely. Some BNs were much stricter than NAT, while others were far more lax. In the longer term, this risks making BN application results more inconsistent and RNG based.
Feedback for the trial was generally positive, though many BNs in wave 2 felt they would do it occasionally or for shorter terms.
Most people surveyed felt there should be restrictions in place for who can participate, such as having a good record and being a full BN for the previous 3-6 months (time varied per response). Others felt participating in this format was fun, but that putting BNs on equal standing with the NAT was not entirely a good idea, since bias and differing standards become harder to track.
Originally, we wanted to explore BNs for the osu! gamemode fully evaluating themselves with a few key NAT members helping to provide guidance and serve as tiebreakers. However, when looking at survey results we may push in a different direction and would like to discuss this further before making any final changes or decisions.
Discussion
Currently we know this for sure:
- If implemented permanently, it would be on a cycle-based system to help combat motivation dying out and negatively affecting turnaround time. Wave 2 was definitely too long, so somewhere around a month would be ideal.
- We will also implement some form of limits on who can participate, but the exact details for that have not yet been decided.
How do you think the future system should look based on the trial information and survey results?