The BN Evaluators Trial Review and Discussion

Noffy

Nomination Assessment Team

Joined April 2012

Topic Starter

Noffy 2021-08-19T16:36:39+00:00

Hello! Please note that the following is for just the osu! game mode.

osu!taiko is currently running a similar trial, which may have its own post in the future, but discussion should be kept around just osu!standard for this thread.

Overview

Previously, the NAT handled BN applications and evaluations themselves. They would hold the discussion and be the final decision makers. BNs could submit mock evaluations on applications so the NAT had a better idea of who would be good NAT candidates in the future, but their votes did not influence final decisions.

When NAT did evaluations solely on their own, each evaluation would be randomly assigned 3 NAT on the BN website. NAT would initially do these individually to avoid confirmation bias. When an evaluation card reached 3 NAT evaluations, it would move to the group discussion stage, where NAT would discuss their decisions and write feedback. A Discord bot would notify them of assignments and upcoming due dates. NAT could choose to add a random number of BN evaluators, but the BN evaluations were not required for cards to move to the group stage. BNs were also not able to see evaluation cards in the group stage.

How this changed during the trial

In mid-May we started a trial of a slightly different system. In this trial the osu!standard NAT were joined by Beatmap Nominators in evaluating current BNs and BN applications. This gave BNs full access to evaluating both applications and current BNs alongside the NAT. They got to be more involved in the decision-making process, participate in the follow-up discussion hosted in NAT channels, and write the feedback sent to the users evaluated, all alongside the NAT members. Recent BN applicants may have noticed this from seeing purple names on their BN application feedback.

Evaluations would be assigned 2 BNs and 2 NAT, and require 4 evaluations to move to the group stage.
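
As a rough illustration of the rule above, here is a minimal sketch of the assignment and group-stage check. It is not the BN website's actual code: the function names, the pool lists and the use of uniform random sampling are all assumptions made for the example.

```python
import random

def assign_evaluators(nat_pool, bn_pool, n_nat=2, n_bn=2, rng=None):
    """Pick evaluators for one card: 2 NAT and 2 BNs by default (hypothetical helper)."""
    rng = rng or random.Random()
    return {
        "nat": rng.sample(nat_pool, n_nat),  # sampled without replacement
        "bn": rng.sample(bn_pool, n_bn),
    }

def ready_for_group_stage(submitted_by, required=4):
    """A card moves to group stage once it has the required number of evaluations."""
    # Count distinct evaluators so a resubmission is not counted twice.
    return len(set(submitted_by)) >= required

# Placeholder names, not real users.
nat_pool = ["nat_a", "nat_b", "nat_c", "nat_d"]
bn_pool = ["bn_a", "bn_b", "bn_c", "bn_d", "bn_e"]
assignment = assign_evaluators(nat_pool, bn_pool, rng=random.Random(0))
```

The previous NAT-only setup would correspond to `n_nat=3, n_bn=0` with `required=3`.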

Who participated

Trial participants were selected from BNs who volunteered their interest.

The first wave of the trial was a smaller group consisting of Uberzolik, Mafumafu, Andrea, Nana Abe, Riana, Cheri, Sparhten, Firika, fieryrage and Bibbity Bill (10 users). It ran from mid May to mid June.

The second wave of the trial was a bigger pool of users and over a longer span of time, running from mid June to mid August. This group consisted of VINXIS, StarCastler, Cris-, Petal, UberFazz, realy0_, AJT, rosario wknd, -Keitaro, Elayue, Kudosu, NexusQI, pimp, NeKroMan4ik, Mirash, Mimari, Stixy and Morrighan (18 users).

Trial Feedback

After each wave ended, we surveyed the BNs for their thoughts on the trial. At the end of the trial we also surveyed the NAT members. The purpose of the trial is to explore where we can take the main osu! game mode's BN system in the future, and this thread is meant to help discuss that further.

Below is a summary of the survey response takeaways.

BN Survey Summary

Would you continue doing evaluations if you were given the option?

Responses were overwhelmingly "yes", with only a handful saying no or not sure.

What are your thoughts on the trial and how was your experience with it?

Difficult to communicate, especially in cases where opinions were split and they had to come to a conclusion. In larger groups this decision making lacked a clear process.
Fun and gives new insight into the process, is a cool way to contribute.
Feels better than previous mock BN evaluations, opinions mattering made it more interesting and motivating.

Do you think we should keep running this after the trial? Is there anything you would change?

It is hard to come to final decisions, and would need NAT guidance.
Weigh the results so it is not solely BN decisions deciding who is accepted, rejected, passing current evals, etc.
BNs already have enough to do with main BN work, and evaluations would only be a distraction from their main priority. This would lead to either their BN activity or evaluation participation dropping drastically.
BN self management may not work, but it can be a good replacement for previous mock evals for finding new NAT members.

Do you think any BN should be given the option to do evaluations? If not, where do you draw the line?

This depends on where the system goes.
No, have set restrictions such as a good BN record for x length of time.
Remove those who perform poorly from doing BN evaluations in the future.

NAT Survey Summary

Most points from the BN summary were in the NAT responses as well, so this summary will focus on points specific to the NAT survey.

What are your thoughts on the trial and how was your experience with it?

Opinions were fairly split between being satisfied that it did better than expected, feeling that it's not realistically any different, and worrying that there will be more issues to deal with amid changing standards.

Do you think we should keep running this after the trial? Is there anything you would change?

Cycle frequently.
Good for BNs to showcase other skills related to NAT work.
Fewer users at a time, such as 5-7.

Do you think any BN should be given the option to do evaluations? If not, where do you draw the line?

Not all BNs, require them to be BN for ~6 months.
Have a good record without warnings for behavior, quality issues, etc.
No recent issues in their evaluations.
Otherwise, give those interested a shot. Don't actively look for participants.

Document on additional Trial NAT Observations as of wave 1 written by yaspo

Observed problems

We saw several issues throughout the trial cycle, such as:

BNs using their new abilities to leak evaluation and application results before they were ready or finished.
One BN participating in the trial being placed on probation mid way through, and having to remove them from the trial because of this.
How we can structure this to where BNs do not see their own current BN evaluations in real time, as this can lead to panic if someone knows for sure their eval is going poorly, and general awkwardness from that visibility mentioned by several BNs throughout the trial.
How to transition between waves: wave 2 participants had difficulty completing leftover wave 1 work because they had not individually evaluated those cards themselves.
In both waves we saw carry users who did the majority of the work while some users faded out of active participation.
1. In wave 1 this was partly because all evals were done so quickly that participants with less free time simply did not get the chance to do any.
2. In wave 2, we implemented a system where unassigned users could not do evaluations until they were close to their due date. This was to give assigned users a fair chance to do their given evaluations. However, this change instead led to many evaluations becoming overdue, because there was no active system to track when a card became open to those not assigned to it.
Fluctuating standards between evaluators. NAT as a smaller group work more closely together and have similar overall standards with different insights, but in the larger group of BNs, standards for evaluations varied more widely. Some BNs were much stricter than NAT, while others were far more lax. In a longer term, this has risks of BN application results being more inconsistent and RNG based.
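
To make the wave 2 rule for unassigned users more concrete, here is a minimal sketch of that check. It is only an illustration under assumptions: the post does not say how close to the due date a card opened up, so `open_window_days` and its default of 3 are hypothetical, as are the function names.

```python
from datetime import date, timedelta

def can_evaluate(user, assigned_users, due_date, today, open_window_days=3):
    """Assigned users can always evaluate; others only once the card is near its due date."""
    if user in assigned_users:
        return True
    # Hypothetical threshold: the card opens to everyone this many days before it is due.
    return today >= due_date - timedelta(days=open_window_days)

def opens_to_everyone_on(due_date, open_window_days=3):
    """The date a card becomes open to unassigned users, which was not actively tracked in wave 2."""
    return due_date - timedelta(days=open_window_days)

due = date(2021, 7, 20)
assigned = {"bn_a", "bn_b", "nat_a", "nat_b"}  # placeholder names
```

Posting something like `opens_to_everyone_on(due)` alongside the existing due date notifications would be one possible way to avoid cards sitting untouched until they are overdue.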

Where we go next

Feedback for the trial was generally positive, though many BNs in wave 2 felt they would do it occasionally or in shorter terms.

Most people surveyed felt there should be restrictions in place for who can participate, such as having a good record and being a full BN for the previous 3-6 months (time varied per response). Others felt participating in this format was fun, but not fully a good idea to be on equal standing with the NAT due to it being harder to track bias and differing standards.

Originally, we wanted to explore BNs for the osu! gamemode fully evaluating themselves with a few key NAT members helping to provide guidance and serve as tiebreakers. However, when looking at survey results we may push in a different direction and would like to discuss this further before making any final changes or decisions.

Discussion

Currently we know this for sure:

If implemented permanently, it would be on a cycle-based system to help combat motivation dying out and negatively affecting turnaround time. Wave 2 was definitely too long, so somewhere around a month would be ideal.
We will also implement some form of limits on who can participate, but the exact details for that have not yet been decided.

This leaves figuring out the details for the new implementation of what we trialled, and figuring out solutions to the problems faced during the trial if they are determined to be solvable.

How do you think the future system should look based off the trial information and survey results?

Last edited by Noffy 2021-08-19T20:29:48+00:00, edited 14 times in total.

Naxess

Nomination Assessment Team

390 posts

Joined March 2016

Naxess 2021-08-19T20:30:20+00:00

to where BNs do not see their own current BN evaluations

think picking participants that don't have evals upcoming in that period would fix this, that's how the recent taiko bn evalers cycle was done anyway

Last edited by Naxess 2021-08-19T20:31:47+00:00, edited 1 time in total.

Basensorex

Beatmap Nominator

291 posts

Joined January 2018

Basensorex 2021-08-19T20:45:22+00:00

>In a longer term, this has risks of BN application results being more inconsistent and RNG based.

null point considering it wasnt much different under the usual system, especially a few months ago

ikin5050

583 posts

Joined February 2014

ikin5050 2021-08-19T20:56:02+00:00

If you choose to implement this in cycles of a month length the problem of people having to pick up half finished work from the previous cycle needs to be addressed.

Could be done instead with a month cycle of evaluating and then agreeing to standby for a few extra weeks in case discussions need to be had about users who were evaluated?

clayton

1,842 posts

Joined November 2013

clayton 2021-08-19T21:05:41+00:00

cool stuff, it looks promising that most of the observed problems are things that can be worked out system/dev-side and u kinda already know what to fix. also promising that each wave collected helpful feedback and you're iterating on these ideas quickly. I don't have much to add to what was already said so I'll be keeping my eyes out for a report after wave 3 :^)

Naxess wrote:
to where BNs do not see their own current BN evaluations
think picking participants that don't have evals upcoming in that period would fix this, that's how the recent taiko bn evalers cycle was done anyway

is it difficult to hide them on the website? or does "see" mean like get word of discussion somewhere?

-White

315 posts

Joined February 2020

-White 2021-08-19T21:05:57+00:00

> In a longer term, this has risks of BN application results being more inconsistent and RNG based.

I'm very concerned about this one, especially since NAT standards alone were never consistent. I'd like to see some system implemented to increase consistency if possible.

Noffy

Nomination Assessment Team

1,659 posts

Joined April 2012

Topic Starter

Noffy 2021-08-19T21:15:14+00:00

clayton wrote:
cool stuff, it looks promising that most of the observed problems are things that can be worked out system/dev-side and u kinda already know what to fix. also promising that each wave collected helpful feedback and you're iterating on these ideas quickly. I don't have much to add to what was already said so I'll be keeping my eyes out for a report after wave 3 :^)

Naxess wrote:
to where BNs do not see their own current BN evaluations
think picking participants that don't have evals upcoming in that period would fix this, that's how the recent taiko bn evalers cycle was done anyway
is it difficult to hide them on the website? or does "see" mean like get word of discussion somewhere?

in group stage they're discussed in a central discord channel which can't really be hidden unless the people being evaluated just aren't in the wave. moving it all to the website decreases visibility for participants (new evals and stuff moving to group is also notified on the discord for visibility) and is a lot of dev work for recreating something that already exists.

-White wrote:
> In a longer term, this has risks of BN application results being more inconsistent and RNG based.

I'm very concerned about this one, especially since NAT standards alone were never consistent. I'd like to see some system implemented to increase consistency if possible.

yeah this needs documented guidelines to be set up to go off of for stuff which is currently learned by doing, see "telephone" on yaspo's document too.

ikin5050 wrote:
If you choose to implement this in cycles of a month length the problem of people having to pick up half finished work from the previous cycle needs to be addressed.

Could be done instead with a month cycle of evaluating and then agreeing to standby for a few extra weeks in case discussions need to be had about users who were evaluated?

That's a good idea, will keep note of that. Either that or working on how scheduling is set up so that nothing is due in the last week to ensure all due/overdue cards are finished before a wave ends.

Last edited by Noffy 2021-08-19T21:16:25+00:00, edited 1 time in total.

-White

315 posts

Joined February 2020

-White 2021-08-19T21:51:28+00:00

Noffy wrote:
working on how scheduling is set up so that nothing is due in the last week to ensure all due/overdue cards are finished before a wave ends.

I think this is nice. I don't think the cycles have to be black/white, there can be a smooth transition between them, where at times there might be 2 different eval teams active simultaneously, but only one would be taking the new requests. Could just put one or two NAT in charge of each team so that they're independently managed or something.

VINXIS

Featured Artist

3,161 posts

Joined April 2014

VINXIS 2021-08-19T22:30:29+00:00

My main issue was and still is mostly that there's no direct line of communication with the applicant/evaluated, which also indirectly results in waiting times of months for reappliers + not being able to communicate/work with people we would be bringing into the group and would inevitably work with when nominating sets at some point

I don't think this improves/worsens the process for applicants/the evaluated either way aside for faster application results sometimes (which also probably wont be consistent either), but otherwise I think expanding it to BNs from just NATs seemed like a decent idea since the start and seemed fine in its functionality in trial from the evaluators' side so

Last edited by VINXIS 2021-08-19T22:36:08+00:00, edited 4 times in total.

-White

315 posts

Joined February 2020

-White 2021-08-19T22:37:29+00:00

^ Kinda curious what sort of reason evaluators would have to communicate with the applicant, and how that would speed up results?

VINXIS

Featured Artist

3,161 posts

Joined April 2014

VINXIS 2021-08-19T22:45:58+00:00

When applying and failing the application, u are given a wall of text under a Feedback section on the website, where evaluators attempt to summarize what you should improve on based off of the notes written by the evaluators, and then the applicant is told to fix these and reapply later after.

In comparison to having more direct contact with the applicant, where for example u have the discussion/conversation (which Currently happens by evaluators themselves writing notes and after all 4+ evaluators finished writing notes and choosing if they pass/neutral/fail the mans) with the applicant at the time as well, and having some form of like "followups" for anyone that "doesnt pass at the time" in my head at least seems far more easier to convey more valuable information to applicants (and to ourselves)/get issues resolved faster/streamline the process.

Mainly I just think the discussions that are happening in the current system are missing the applicant currently, and are more valuable than the feedback wall that is sent in comparison, and if it is more valuable, would be easier/faster to fix what those issues are

Last edited by VINXIS 2021-08-19T22:46:49+00:00, edited 2 times in total.

-White

315 posts

Joined February 2020

-White 2021-08-19T23:09:01+00:00

Yes I actually fully agree with vinxis, can we actually do that? I think all of that would improve the applicant experience so much holy shit

Naxess

Nomination Assessment Team

390 posts

Joined March 2016

Naxess 2021-08-19T23:12:47+00:00

VINXIS wrote:
I just think the discussions that are happening in the current system are missing the applicant currently

would be careful about including the applicant too early on; current feedback wall is basically eval notes cleaned up from unnecessary/poorly-worded/poor advice, and having the applicant around when discussing what are and aren't issues and how to best convey that would probably get really confusing

UberFazz

Medal Hunter

162 posts

Joined July 2016

UberFazz 2021-08-19T23:58:03+00:00

Noffy wrote:
How do you think the future system should look based off the trial information and survey results?

I like the idea of allowing certain BNs to evaluate applicants (even though I disagree with doing it in waves, but more on that later), but this raises the question: What becomes of the NAT if the majority of their current work is handed over to BNs?

Disclaimer: This assumes BNs will have the same exact evaluating power as the NAT, similar to the trial.

----

If BNs are to take over evaluations, it'll leave the NAT with only managerial tasks, such as making announcements or implementing changes (like this post). Moderation is included as well, yet I'm unsure of how much moderation NAT members actually do.

As far as I'm aware, these kinds of activities are only limited to a select few NAT members, and it'll make the rest of the NAT nothing more than BNs with fancy titles.

So, what should we do?

Here are some of my ideas.

Leave the NAT as is and allow BNs to eval.

Probably the least appealing option (to me) would be to just roll with it and leave the current NAT as is. This could work, but it really puts into question the reason for keeping someone in the NAT. If a member's only contributions are evaluations, it seems inappropriate to place them in a different boat than the evaluating BNs. Granted, I can't know if any NAT members are in this position (as they could be contributing behind the scenes), but it's definitely possible.

*Edit: Additionally, I believe it's very unlikely for this to work long-term similar to why QAH did not work out. No incentive or recognition will cause very few to be interested, and those that are interested will burn out very quickly if all they're doing is essentially working for the NAT with none of the benefits of the NAT.

Re-organize and/or re-purpose the NAT and allow BNs to eval.

Similar to the first option except it involves separating NAT members that exclusively or almost exclusively do evals from members that contribute through other means that require them to be in the NAT. This would likely mean moving certain members from the NAT to the BN but still allowing them to actively participate in evals.

This would also involve reworking the way new NAT members would be added. Currently, a big portion of being an NAT member involves being exceptional at evaluating applicants, with skill being shown off through mock evals. If the main responsibility of the NAT is no longer evals, how would NAT members be chosen? Would there be a big enough difference between BNs who are outspoken, GMT members whose main responsibility is related to mapping/modding, and the NAT?

The NAT responsibilities listed in this wiki article that aren't evaluations do not need to be done by the NAT specifically. The GMT can handle moderation, especially with the new waves of mapping/modding GMT, and like I've stated earlier, I don't think many NAT are very interested in this aspect anyway. Structural changes can be done by anyone by making relevant posts on the forums or GitHub. A quick look at the listed responsibilities shows that nearly half (5/11) of the (standard) NAT do not seem to be interested in anything besides evaluations (but they can still contribute in other means if they so choose).

This leads me to my final two options. To be perfectly honest, I have no idea how realistic they are, but this is just brainstorming anyway.

Continue letting BNs evaluate applicants in a similar system to the trial, but use it as a method of finding good NAT candidates instead of replacing NATs in general.

To me, this trial seemed like a much better method of seeing how well certain BNs would perform as NAT instead of outright replacing the NAT. This would make for rare waves, and they would only really be used when the NAT is in need of new members, similar to the current systems. Seems pretty simple, but there are certainly some issues with this.

Would this really solve the problem of not having enough manpower to effectively evaluate applicants when the NAT is hesitant to accept new members? Would it be sufficient enough to keep up with the constantly growing community when it's not self-sufficient like the other proposals? Unfortunately, there's no way to know.

The main issue that I've always seen mentioned is regarding moderation. NATs have mod powers, and you generally need to be especially trusted/known to be in a position that has access to site-wide moderation. If we want the NAT to keep their moderation abilities intact, there's no real solution to this other than being forced to select only the NAT that can also be trusted with this kind of power. However, as I've said earlier, I don't believe moderation is important to the NAT.

This leads me to my final proposal.

Use the new system as a way of finding new NAT members, but strip the NAT of their access to site-wide moderation.

This seems like an ideal solution to me, but I'm unsure of how realistic it is. As far as I'm aware, many behind-the-scenes aspects of the community treat the NAT as moderators, so this would force some restructuring in those areas.

However, I'm not proposing for the current NAT to have their permissions revoked. Instead, my idea involves moving the current NAT to the "mapping/modding" category of the GMT, while also keeping them in the NAT. This would effectively make the current NAT the same as before, but would allow for much more liberty in future NAT selection.

This, in theory, resolves the previously mentioned issues regarding the addition of new NAT members. This is also partly why I'm against eval waves, as mentioned earlier. (I personally dislike the idea of only having temporary access to evals, since it would ruin the possibility of having a consistent workflow. I disagree that the absence of waves would cause burnout or similar, since there should be enough people to manage the given work at any point in time.)

This would draw a clear line between normal BNs who just nominate/disqualify maps, BNs that also evaluate applicants and current BNs (NAT), BNs that would also like to moderate and/or partake in managerial tasks (BN + GMT), and BNs that evaluate applicants and would like to moderate and/or partake in managerial tasks (NAT + GMT).

Roles would be much more defined, unlike how they would be without this implementation. You'd have no way to know if a BN was responsible for evaluations or not, for example.

*Edit: Thought about it some more and thought of a possible issue with this solution: How would these new NAT members be evaluated? How could they be properly tracked and made sure they weren't messing up?

Evaluations should be done for NAT members as well. How exactly I'm not totally sure of, but it would be preferable if outside voices could be heard to prevent an echo chamber. This would likely include BNs giving inputs on NAT performance, or we could return to a system with NAT leaders in an attempt to address this concern.

Also yeah, the 3-6mo in BN idea is nice too.

----

So those are my thoughts on the matter. I hope at least something can be taken away from this, and if not, it could serve as a nice little thought experiment. Thank you to all the NAT members that are constantly trying to improve the modding scene

Last edited by UberFazz 2021-08-20T15:24:28+00:00, edited 4 times in total.

-White

315 posts

Joined February 2020

-White 2021-08-20T00:01:18+00:00

Naxess wrote:
would be careful about including the applicant too early on; current feedback wall is basically eval notes cleaned up from unnecessary/poorly-worded/poor advice, and having the applicant around when discussing what are and aren't issues and how to best convey that would probably get really confusing

I agree that the applicant shouldn't be included too early, but on the other hand, not including them at all (current system) can (as it did for me) result in the feedback summary lacking any actual actionable steps to improve, and being so "safe" that the feedback only serves to confuse the applicant even more due to how vague and non-specific it is. Things like "wording could be improved" don't actually improve anyone's wording, but having a discussion with the BNs about how exactly they'd prefer to have seen it worded does.

Naxess

Nomination Assessment Team

390 posts

Joined March 2016

Naxess 2021-08-20T00:22:50+00:00

yeah we sorta assumed people would contact their evaluators if they had questions or anything was unclear, but people generally seem to avoid that, so swapping the roles and having evaluators contact the applicant to discuss could definitely help with that

we had a similar issue in mentorship where mentees ignored by their mentor would live with it rather than contacting the organizers about it, so we flipped that system and had organizers regularly checkup on mentees instead

Last edited by Naxess 2021-08-20T00:28:32+00:00, edited 3 times in total.

-White

315 posts

Joined February 2020

-White 2021-08-20T00:50:30+00:00

Well it's also a pain when you can't contact one individual for feedback. For my app I had to contact each of the evaluators (4) just to get their specific feedback. After the 2nd I was like "this isn't worth my time" and stopped. It's just a huge inconvenience for everyone to have it so separated I feel like, and a huge burden on the applicant to get the information that any one NAT has complete access to...

momoyo

Beatmap Nominator

228 posts

Joined June 2018

momoyo 2021-08-20T06:18:46+00:00

I couldn't agree more with UberFazz's idea and I was actually thinking something like that but slightly different.

My main idea was the same with the only difference of not mixing 2 roles (GMT+NAT) but making a new role or bringing QAT alive again, since NAT and GMT are 2 different roles when it comes to doing stuff in this game from what I know. The point is to move current NATs to QATs and revoke NATs' permissions while QATs keep theirs the same, so new NATs would be able to focus on doing evals for Beatmap Nominators while the others can focus on other stuff (such as the Mappers' Guild site, other affairs, etc.).
I know making a new role would require Dev's time but this is still something I believe should be considered in my opinion.

Akito

138 posts

Joined January 2015

Akito 2021-08-20T09:02:38+00:00

Personally I wasn't involved with the project but I've talked to BNs and heard complaints about evals so I'm just gonna post some of the concerns I observed.

1. Varying standards among evaluators, including some people being much more lenient than the NAT. This can obviously lead to conflicting opinions and lower overall quality in evaluations. I've heard people on multiple occasions complain that "x BN will pass anyone/their standards are so low/do they even look at the mods".

2. The NAT are dead (I don't mean to undermine the work of the NAT in any way but this was also a common complaint). The post says 2 NAT and 2 BN members are assigned to each eval, but this doesn't mean they are the final evaluators. If the evaluation goes overdue anyone can take over, so it was very common for evaluations to be completed solely by BNG members with zero input from the NAT. Many people including some of the trial evaluators themselves have expressed that getting into the BNG right now is extremely easy because of this combined with the first issue.

This could also potentially lead to circlejerk if BN evaluators become a thing since it only takes a few of your friends to say yes on your application.

3. A small group of people carrying all of the work (this has already been addressed in the original post but yeah). After asking some people from the last trial group it seems like very few (maybe 2-4) were able to communicate well, maintain high standards, and have high activity simultaneously. I also heard countless people say things like "I don't even know why I signed up" which shows not many people are actually interested in being evaluators. This probably means the number of suitable candidates for evaluators in the entire current BNG is only a small handful, and their commitment to doing evaluations is also uncertain.

Not really sure what the best move would be but making a group within the BNG for evaluators seems like a pretty unattractive option given these things.

Last edited by Akito 2021-08-20T15:56:53+00:00, edited 3 times in total.

AJT

529 posts

Joined August 2013

AJT 2021-08-20T12:43:00+00:00

pretty much agree with everything Akito said, leading to my view that I don't think this would be sustainable - I just think it's more structurally sound and less prone to exploitation to have it such that the people evaluating you are at a level "above" you, unless the people chosen to evaluate are heavily scrutinised and ensured to be very high quality BNs in all aspects and have fair judgement/reasoning. Otherwise it just feels kinda off getting evaluated by people who also have to be evaluated, especially when standards vary so much.
Also, I don't know how active the NAT was during the first wave but I feel like having these cycles just encourages them to make themselves scarce which leads to 4xBN evals and I don't really think that's ideal but it also can't be avoided if the NAT aren't doing their assignments in time. If there were either no BN evaluators or very few then the chances of having consistent NAT input would be a lot higher

also like Uber I don't think a cycle-based system would work very well - a lot of people already lost interest during their first waves, and I feel like there's only so long before even those who are still interested also become disinterested - and for the people who *do* remain interested, they probably wouldn't enjoy having to stop every other month or so. I feel like a lot of people simply signed up because you just had to click a reaction and they thought "hmm may as well try it out" but I don't really think most people would be interested in doing it long term. (I can't speak for everyone though)

Last edited by AJT 2021-08-20T13:25:37+00:00, edited 2 times in total.

Kibbleru

osu! Alumni

5,844 posts

Joined August 2013

Kibbleru 2021-08-20T14:40:55+00:00

So.. While I was pleasantly surprised by how well the first waves went, I worry that this wouldn't be sustainable (as much as I would really HOPE it to be).

So one of the comments really struck me as logical. We have had similar events in the past of NAT/QAT trying to hand over their duties to the BNG in the form of QAH. For the first while, maybe even a year, this was a successful thing, but as soon as the honeymoon phase was over, people's motivation started to dwindle, leading to the state QAH is in now..

The priority we should focus on is what kind of rewards, compensation will keep people going. Do we want to slap them on a title, name color, for some amount of time? badges?

A rotation cycle is a good idea as a start, but how many qualified individuals can we come across before we eventually run out? Even within the trials, we've found that it's been the same certain people keeping it running, but how long will their drive last?

What will be the cycle period? It takes some time to get fairly consistent in eval work. Too short and people won't really know what they're doing before they're cycled out, and too long you risk people losing motivation.

Last edited by Kibbleru 2021-08-20T14:42:51+00:00, edited 2 times in total.

UberFazz

Medal Hunter

162 posts

Joined July 2016

UberFazz 2021-08-20T15:10:04+00:00

While Akito's points make sense, the first issue is solvable by only selecting certain BNs and the second issue isn't even related to this new system and is instead an issue in itself that should be addressed by the NAT.

I'm still insistent on moving them to a separate group without mod perms though, instead of keeping them in the same group as other BNs. This would provide the mentioned incentive by Kibb (which I totally agree needs to exist, as there is currently zero incentive to do this unlike BN work), and it'd likely help avoid issues similar to what happened with QAH. It would also fix any issues with a cycle-based system by continuing to use the existing NAT system to evaluate applicants.

I really don't think a cycle-based system would work for reasons best explained by AJT.

We already have a system that works in the form of the Nomination Assessment Team. It seems much better to expand this current system rather than trying something else.

Kibbleru wrote:
What will be the cycle period? It takes some time to get fairly consistent in eval work. Too short and people won't really know what they're doing before they're cycled out, and too long you risk people losing motivation.

For clarity, Noffy mentioned that this period should be around a month, and that the 2-month period was "definitely too long."

Last edited by UberFazz 2021-08-20T15:12:48+00:00, edited 2 times in total.

Cheri

1,770 posts

Joined November 2014

Cheri 2021-08-20T15:49:21+00:00

Feeling a bit better than yesterday, and just so I can at least have some input in the thread: I'd much rather you guys not trial a cycle-based system, period ^^;

First cycle already had some dead people (and even a couple who were active ended up being dead later down the line) = some carried

2nd cycle was worse with that = same results

You guys pretty much already have an idea of how active people will be in a month and how only some will be able to keep up the pace. With just this in mind, I refuse to believe that the very people who are interested in continuing would like the idea of being switched out every month or 2, when something as important as evaluations should be consistent and work with people's schedules irl

It's just way too scuffed, and I think it would be more important to simply look over the BNG and pick out those who actually did something within it, as well as give others who didn't get to do anything a chance for a short period, before closing up the group until more members are needed. That beats going with what's guaranteed to be a very unpolished way of going about things for the next couple of months or so
(basically also agree w/ajt on the idea that it just seems harmful for the very few who would want to continue doing it lol)

I for one like Uber's idea of just making a new group without mod perms since that could solve the cycle-based idea, but regardless I am strongly against using the first idea as the final outcome of this experiment :/

Last edited by Cheri 2021-08-20T15:57:41+00:00, edited 6 times in total.

Nao Tomori

Nomination Assessment Team

3,089 posts

Joined December 2014

Nao Tomori 2021-08-20T16:43:36+00:00

I think established guidelines for what "good modding" looks like would go a long way. I remember Monstrata making a post about something like that a long time ago. The biggest issue I find recently is that many people who aren't "in the know" don't really get what BN app evaluations look for in their modding, and publishing guidelines to that effect would go a long way in educating the modding community about the types of issues to look for.

This obviously runs into the issues that were prevalent around 2015 though where if you don't say exactly the same stuff you don't get in and therefore a very specific set of standards ends up superseding everything else - but this can be avoided by keeping the guidelines to a very general sense of what BNs/NATs value in a good map or mod. By having common guidelines between what types of issues BN applicants should be looking for in the maps they're modding (like basic unrankables, contrast issues, spread issues, that stuff) and what BN app evaluators look for (clear and concise wording, not missing major issues), a lot of the need for appeals and general annoyance at getting denied repeatedly can be reduced or removed.

Stixy

osu! Alumni

114 posts

Joined September 2016

Stixy 2021-08-20T17:15:26+00:00

Just some quick thoughts, since I am on phone so formatting is hard etc.

Personally I like uber's ideas and feel like they could work out well. Would need some more in-depth discussion but yea.

Having it cycle based is rather counterintuitive imo and, just like others mentioned, makes it hard for people to get into it and be consistent etc.

Regarding Akito's points, I think the first point is solvable by only letting e.g. select people or something like that be part of it.

This would also somewhat help with the second point, as having only select individuals means having fewer people, and therefore more NATs participating in evals. Or just make it so that no more than 2 BNs can submit an eval on something.

People doing more stuff than others was always a thing and there is no direct solution to that since people have different priorities etc. I personally mostly did assigned stuff only during the trial, since this would be somewhat representative of the workload that "I can always expect".

Mafumafu

Beatmap Nominator

3,367 posts

Joined August 2013

Mafumafu 2021-08-20T20:45:04+00:00

I want to share my thoughts on the learning and feedback process of cultivating BN evaluators, based on my experience in the first wave of the project.

Whatever the system will be, I think one notable concept that should be kept is the mechanism for cultivating BN evaluators used during this trial NAT project. Involving BNs in evaluation is not a new topic; a great number of evaluations were done by BNs before this project, through random selection or opt-in. However, BN evaluation was basically an independent task for its participants: since BN evaluators remained anonymous, it was very hard to obtain feedback or discuss with other evaluators. (You could actively ask for feedback, but that was minimal.) So it was very difficult, or at least inefficient, to improve.

As a participant in the first wave, what I found most meaningful and edifying is that, through communication with your fellow trial NATs as well as the actual NATs, your evaluation capability can be "cultivated". There are experienced members who can give you feedback on your evaluations that is far more intensive and exhaustive than what BN evaluators got before the project. Through this process I learned a lot about the flow, principles, and detailed implementation of the BN evaluation process, which was impossible to achieve in my previous experience as a BN evaluator.

As mentioned by yaspo in the report, although some trial NATs may be very active, they may not be very good at evaluation, for example writing poorly organized feedback or even making erroneous evaluations. I also agree with many points in this thread that "not all people can evaluate well". However, I would say we should actively "cultivate" BN evaluators instead of passively "picking" them. By involving BNs in this trial NAT project, with its intensive feedback and in-group discussions, I think it does much better at "cultivating" than the previous BN-as-evaluator scheme.

This can also help with the "motivation curve", i.e., members are very active at first and then gradually become inactive. One notable reason for members to lose interest in BN evaluation can be the minimal feedback on how they did. Previously, BN evaluators mostly worked individually, not as part of a "Team" (which is in the title of NAT, and also very important in the evaluation process imo). "What is the point of me still doing this if the process is minimal in feedback or interaction with fellow evaluators?" On the other hand, if you can get feedback, can involve yourself in the discussion process, and can interact with fellow evaluators (during the discussion phase), I think more people will be willing to devote time and effort to it, as you feel like you are working in a team instead of as an individual.

Kudosu

Beatmap Nominator

109 posts

Joined October 2017

Kudosu 2021-08-20T23:15:44+00:00

i stand by what akito / ajt said mostly but imo a cycle system can be a good thing if its sole purpose is to scout future nats, literally making it trial nat

its kinda what was done before with mock evals but with how trial nat had same input as nat etc, i believe in that case 1 month cycles of like 5 people would work pretty well

reason i dont think full on bn evaluators will work: after taking part in the second trial nat cycle i dont believe that many candidates were good enough (reasons mentioned by everyone else, standards discussion etc). if we need to narrow the number of suitable candidates that much, just making them nat is much simpler

Logic Agent

483 posts

Joined April 2015

Logic Agent 2021-08-21T23:41:27+00:00

wanted to pop in and say i agree with uber's final proposal, seems like the best case scenario for everyone.

also as someone who got extended probation, it just kinda feels strange getting evaluated & extended by someone 'on the same tier as you' so to say. like yes obviously some bns are more experienced than others (and my evaluators are more experienced than me, for starters due to the fact i was literally new to the role.) however i just thought that's something that i could add, and further supports uber's idea of splitting up the roles a bit more

Kibbleru

osu! Alumni

5,844 posts

Joined August 2013

Kibbleru 2021-09-05T01:53:48+00:00

VINXIS wrote:
When applying and failing the application, u are given a wall of text under a Feedback section on the website, where evaluators attempt to summarize what you should improve on based off of the notes written by the evaluators, and then the applicant is told to fix these and reapply later after.

In comparison to having more direct contact with the applicant, where for example u have the discussion/conversation (which Currently happens by evaluators themselves writing notes and after all 4+ evaluators finished writing notes and choosing if they pass/neutral/fail the mans) with the applicant at the time as well, and having some form of like "followups" for anyone that "doesnt pass at the time" in my head at least seems far more easier to convey more valuable information to applicants (and to ourselves)/get issues resolved faster/streamline the process.

Mainly I just think the discussions that are happening in the current system are missing the applicant currently, and are more valuable than the feedback wall that is sent in comparison, and if it is more valuable, would be easier/faster to fix what those issues are

A few weeks ago, I came upon a Discord bot that basically functions as a ticketing system. You click a button and it creates a private channel for the user and moderators to talk. This could potentially be something we have in the BN server.

I think it was https://tickettool.xyz/

Last edited by Kibbleru 2021-09-05T01:54:28+00:00, edited 1 time in total.

VINXIS

Featured Artist

3,161 posts

Joined April 2014

VINXIS 2021-09-05T03:58:57+00:00

This bot is insane
