I would like to start a related discussion on reintroducing the tiered BN system. Here’s the post for reference—I wonder if it will pique anyone’s interest: https://osu.ppy.sh/community/forums/topics/1899201?n=1
I don't know how a test that only shows your English comprehension (the original intent of the RC tests back in the day) or your competence to google regularities assures attitude.
ikin5050 wrote:
Gamifying the process might make it more fun and I understand why my thoughts are inconsistent with that.
I ask you to consider this: if we gamify the process of becoming BN and remove tests, does that not open the door to people's attitudes being less serious and their rigorousness/thoroughness in checking maps becoming lax?
Increasing transparency on a BN application's status
Reworking BN application feedbacks
Platform for communication between NAT and applicant post-application

These seem like good changes and are definitely long overdue, so it's nice to see them added. My only concern is that individual feedback from each evaluator could potentially be inconsistent, contradictory or confusing due to differing views, so when making these comments available to the applicant, it would be important to make sure they're easily understandable and don't conflict with other evaluators' comments, and to adjust them if necessary.
Removing the RC test

I'm not sure how I feel about this one, honestly. On one hand, it is indeed a tedious test for applicants, and maintaining it is probably tedious for the NAT as well. But on the other hand, it might still be a valuable tool to determine knowledge/comprehension of the ranking criteria, BN rules, modding code of conduct etc. While it's true that looking up the answers is possible, it still teaches people about these things, which they might not even read or know about otherwise. Therefore it could be seen more as a lesson than an actual test, which I don't think is bad. Whether it helps filter out incompetent candidates is another story, and it's hard to judge without knowing how many people pass/fail the test. Assuming that the vast majority of applicants pass the test, perhaps increasing the threshold for passing could be an option to make it a more efficient selection process. Adding more questions that really test the person's knowledge, understanding and judgement could also make it more useful.
Reworking BN applications

First I want to address the proposed changes and some of the comments posted in this thread so far (I will talk about other issues and my own suggestions in a separate post later).
It's not trolling to give a valid reason not to nominate a map, just like "I don't like the map" is a valid reason. This is why the questions' objective should be very clear. If certain answers are not wanted, the questions should be formulated in a way that doesn't allow said answer to be given.
achyoo wrote:
Goes without saying but please disclaimer to people to not answer "no dont like song" even though that's what all BNs do, since it gives nothing for you to judge.
Or maybe don't, let the people who troll wait 60 days more.
This is another example of "hidden expectations" within BN applications that have been a huge problem. There is no mention anywhere that a "safe map by an experienced mapper" is not desired, so expecting people to know this is unfair. Aside from the fact that there is no objective definition of such a map, I also don't get why it would not be appropriate to choose. Sure, if the map is very polished there might not be much to point out, but a good modder still has the possibility to find areas for improvement and mistakes that should be fixed. Also, aren't BNs supposed to nominate high quality maps (which are often created by experienced mappers)? This even shows they are capable of recognizing maps that are better than average.
Nao Tomori wrote:
We are requesting one map judged as good (but not an overly safe map that has already been bubbled or by an experienced mapper)
Unfortunately I fail to see how these changes concretely achieve these goals or even move closer to them, at least from the perspective of applicants. The transparency improvements are good, but other than that I really don't think this would make things "less obtuse, easier to understand" or give "a sense of direction", it's just slightly different than before but essentially the same process.
RandomeLoL wrote:
The goal of the changes wasn't so much to outright reduce the bar of entry into the BNG, but to make it less obtuse, easier to understand, more transparent to the end user, and finally to give a sense of direction of what exactly should be prioritized when evaluating someone's work, which would affect the way applications are approached from both ends.
I don't think there's even a single person who is having fun or treating BN apps like a game. Even without the BN test, applicants are preparing for it like an exam and usually getting nervous. Sometimes people might "yolo" apply without caring too much but usually those attempts are not successful, unless they're very experienced.
RandomeLoL wrote:
Users should have a better, more fun time. Submitting mods that they may've done at their own leisure and pace feels to be less restrictive and mentally taxing.
Regarding the individual feedback point, we already do this currently; it was quite labour-intensive to write thoughts individually and then collate them into one sole feedback message. It also serves as nice proof of our work for our own evaluations and an archive to look back on, so I think it's nice to keep.
Nifty wrote:
Like this change. Previously, the only way to disagree with NAT feedback (assuming you privately asked for it in the first place, or asked somebody else because the NAT ignored you) was to rally against them on Twitter. I only hope that the NAT are willing to put in the effort to satisfy (or try to satisfy) applicants. If providing feedback was already one of the most exhausting and delay-causing tasks when the previously given feedback was known, I can't see how making each NAT member write more cohesive individual feedback will be less exhausting.
I do however think it is a bit silly to ask applicants to submit a map they would not nominate. I don't know any BNs who routinely mod maps they don't nominate, so I don't see why that would be something we would look for in a prospective BN. Not much information about the applicant would be given from a map they wouldn't nominate, especially when a BN's reason for not nominating a map is usually simply not liking the song, the map being too easy/hard, or some other really simple and boring reason.
Oh, this sounds like we can make everyone a nominator lol
Drum-Hitnormal wrote:
but this is fine since they dont cause any problems.
by being more indicative of reading comprehension than BN abilities.
isn't that the entire point of the test? understanding the sometimes meticulous wording of the RC is important when you run into new examples of the edge cases it was designed around.
Imo the proposed changes are already formulaic because they clearly lay out a setup similar to what people have already been using in their apps for years (1 nommable map, 1 bad map, 1 filler).
Nao Tomori wrote:
Appreciate the feedback. Updated the wording on the first one. For the third map, it is supposed to be a wildcard for the applicant to "shore up" their app. It's supposed to be pretty vague - I wanted to avoid giving extremely specific instructions (well to be honest what I actually wanted was 2 of map 1 and 2 of map 2) so that it didn't become way too formulaic as it kind of has been in the last few years.
emm, now this sounds better, and I understand that you guys are looking for someone who can improve mapsets instead of a ranked-section gatekeeper. But this way we might be putting more pressure on checking them during the qualified phase, which people lack interest in. I think we will need to incentivize people to check RC-related stuff during the qualified phase, otherwise some errors might sneak through.
Nao Tomori wrote:
I think leaving exactly what a BN check entails up to the applicant makes more sense as that's what they would actually do when BN checking a map. Then we can evaluate that.
The maps don't have to be varied (beyond the mapper name). The whole point of the 3rd one is to not be specific - not sure what kind of clarity is needed there as it's supposed to be a catch all for anything the applicant thinks is missing from their application. Also, we aren't trying to teach people how to mod from square one in the application form. There is no checklist of issues the applicant needs to mod or we won't accept them or something. As long as the submitted mods show an ability to add value to the maps and reflect a good understanding of modding, it's what we're looking for.
It is much clearer and better than the first post.
Nao Tomori wrote:
Please see below for the updated application guidelines. Let us know your thoughts. As a note, the evaluation generally does not rely on the mapper responding to the mods at any point.
Map 1: Submit a "BN check" mod on a map that you believe is close to a rankable / nominatable state. If the mapper were to address your mods, you would immediately be ready to press the nominate button. The map should be by a mapper with 5 or less ranked maps. The map should not have any nominations at the time of submitting your application.
This is intended to provide information on your ability to conduct the final steps of the modding process as well as independently evaluate a map's overall rankability.
Map 2: Submit a mod on a map that you would not nominate unless significant improvements are made. Additionally, briefly explain why the map was not in a rankable state when you modded it; your modding should generally address those concerns. The map should include a full spread of difficulties. The map should be hosted by a different mapper than the first map (including collab participants).
This is intended to provide information on your issue identification skills, communication and wording, and ability to evaluate a map's overall rankability.
Map 3: Submit a mod on a map that, in your opinion, would be helpful to us in evaluating your ability to judge map quality and readiness. Indicate whether you would or would not nominate the map after your modding has been addressed. The map should be hosted by a different mapper than the first or second maps (including collab participants). The map should not have any nominations at the time of submitting your application.
This will provide you with an opportunity to further improve your application, keeping in mind the intentions stated in the descriptions of the previous submissions.
Does it have to be a map from a "new-ish" mapper? Might be pretty hard to find one at some point, I think ~10 ranked maps would be better tbh. idk if the figure is this low just for the sake of the applicants to post more mods cuz such mappers int more (I assume) but I do think map quality really depends, despite how many ranked maps the user has.
Nao Tomori wrote:
Map 1: Submit a "BN check" mod on a map that you believe is close to a rankable / nominatable state. If the mapper were to address your mods, you would immediately be ready to press the nominate button. The map should be by a mapper with 5 or less ranked maps.
Refers to top diffs only or including low diffs collabs?
Nao Tomori wrote:
The map should be hosted by a different mapper than the first or second maps (including collab participants)
1. Refers to Host
-Hitomi wrote:
Refers to top diffs only or including low diffs collabs?
Also, if the 2 oszs (before mod and current) are still a thing, is it necessary to have them (specifically the after mods one)? Feels like that's just overcomplicating the process cuz even if the host messed up after applying mods ofc u won't nominate it and instead it will require more modding.
Nao Tomori wrote:
Map 1: Submit a "BN check" mod on a map that you believe is close to a rankable / nominatable state. If the mapper were to address your mods, you would immediately be ready to press the nominate button. The map should be by a mapper with 5 or less ranked maps. The map should not have any nominations at the time of submitting your application.
Map 2: Submit a mod on a map that you would not nominate unless significant improvements are made. Additionally, briefly explain why the map was not in a rankable state when you modded it; your modding should generally address those concerns. The map should include a full spread of difficulties. The map should be hosted by a different mapper than the first map (including collab participants).
This is intended to provide information on your issue identification skills, communication and wording, and ability to evaluate a map's overall rankability.
Does this new revised application include the previous questions? i.e. "Why do you think the map is ready to be nominated?"
Nao Tomori wrote:
Map 1: Submit a "BN check" mod on a map that you believe is close to a rankable / nominatable state. If the mapper were to address your mods, you would immediately be ready to press the nominate button. The map should be by a mapper with 5 or less ranked maps. The map should not have any nominations at the time of submitting your application.
This is intended to provide information on your ability to conduct the final steps of the modding process as well as independently evaluate a map's overall rankability.
that option should have remained there. if it's not there we will need to add that back
Mirash wrote:
Will the fast rejoin thing be up
I believe that the difficulty of finding maps to use for application is overblown. The advice I always give modding mentees is to simply go to a BN that has their request log public, and mod maps from people that requested said BN. There's more maps suitable and available out there. If applicants still struggle to find maps, it's because they are going into maps looking for specific issues rather than modding a map and finding issues in the map.
Serizawa Haruki wrote:
1.1) The "3 mod showcase system" is generally not ideal because even just finding 3 maps that meet the desired criteria can be very difficult and time consuming. A lot of maps are incomplete or very low quality which already makes them unfitting for the application. Some others are very high quality and therefore don't offer a lot to work with. Those that fall somewhere in between often have lots of mods already which again reduces the amount of content that can be used to demonstrate one's modding skills. Usually the maps are also supposed to contain specific issues in order to cover all possible aspects of mapping as expected.
Most applicants do the formulaic modding and most of them fail, so I don't see why this is an issue. Anyway, the new system already solves this by shifting focus from the mods themselves to the overall decision making and ability to judge maps.
Serizawa Haruki wrote:
1.2) The expectations/requirements are also problematic due to the fact that they lead to "artificial" mods that often don't reflect actual mods done by BNs. Applicants don't just mod any map or make any kind of mod, over time a specific formula has been developed which is supposed to meet these requirements. However, this doesn't properly measure someone's modding abilities but rather the ability to figure out what exactly evaluators are looking for and adapt accordingly. If you took random mods made by BNs and used them in a BN application, the result would most likely be negative since there are no such expectations from them while being a BN, so expecting them from applicants doesn't make much sense and is not realistic.
How is that a BN app problem? That's a mentorship problem. Modding mentors are teaching people to mod that way for BN apps, so everyone does it. It's funny because most people that do this don't pass, so I don't know why they keep doing it.
Serizawa Haruki wrote:
1.3) This way of modding for BN applications can have other drawbacks as well, such as copying certain mods/suggestions from other people without understanding them and using them in a different context where they might not even apply. It has led to modding becoming quite homogenized because people are treating it like there is one right way to mod and wanting to learn that in order to become BN, so they are often blindly following other modders by pointing out the same type of issues, using the same reasoning, wording, etc. In reality there are many different "modding styles" and none is objectively better than the other. Moreover, by trying to check all the boxes, modders might focus too much on finding potential issues and subconsciously mentioning things that are fine or exaggerating minor issues.
Hidden expectations is a fair argument, but I believe the new system solves it pretty adequately. What you need to do is clearly outlined, but I expect that people will take time to adapt and change their BN app methods, so we can look back on this in a few months to see how applicants are doing.
Serizawa Haruki wrote:
1.4) The evaluation criteria are unclear/vague and there are hidden expectations so modders essentially have to take a guess as to what they should and shouldn't do. While the attempt to improve this aspect as discussed in this thread is a good start, I still have doubts whether it really works in practice. This might also have to do with the way evaluations are done though, which brings me to my next point.
From my personal experience, evaluators' personal opinions and preferences don't have as much of an impact as you think. The evaluators will bring them up in group discussion, but very rarely is it what makes the difference in the final evaluation outcome. Most of the biggest "mistakes" are judged based on what is intersubjective: using past DQ discussions and veto mediation outcomes as precedents for what needs to be enforced and what doesn't. At least this is my experience from being an evaluator for 8 months.
Serizawa Haruki wrote:
1.5) Due to the subjective nature of mapping and modding, evaluations can differ widely from one NAT member to another. As such, there is a certain RNG component at play, which can make evaluations feel unfair. This also means that if an applicant has different views on map quality or on what is and isn't an issue in a map compared to an evaluator, it could impact their result negatively. While it might be impossible to avoid making judgements based on personal preferences, it should be reduced to a minimum.
Yea I agree, but I would assume that evaluators are aware of this and do try to look at the big picture rather than focusing on mistakes. The new system should help because the applicant should be judged holistically and not based on their modding mistakes anymore (due to the new decision-making judgement portion). BTW, 3 fails can still result in a pass; the vote is not final, and whatever consensus comes out of group discussion is final rather than the vote itself. I'm not saying it happens regularly, but it is a possibility.
Serizawa Haruki wrote:
1.6) Unfortunately evaluations are also prone to bias in multiple ways. Firstly, there might be a subconscious bias towards negative aspects since the task consists of checking for mistakes the applicant might have made, similar to what I mentioned above regarding modders focusing too much on finding potential issues in a map. Mistakes and shortcomings seem to hold significantly more weight than things that were done well, so even if most aspects are positive, it can still result in a failure. Another thing to note is that according to the evaluation process, if the majority of the evaluators votes "fail", it results in the applicant automatically being denied. However, the same is not true for a majority of "pass" votes, again indicating a tendency towards negativity.
Substitutions are generally never done on a whim; they only happen when a) an evaluation goes overdue, OR b) one of the evaluators specifically said they can't do a certain one, in which case it's usually rerolled, not handpicked. Disclaimer that this is based on my tenure and I cannot say with 100% certainty that it works this way now, but I think it's fair to assume they still do it this way.
Serizawa Haruki wrote:
1.7) The other form of bias consists in favoring people someone likes or is friends with, and on the other hand opposing people they dislike. This is exacerbated by the fact that evaluators are not always exclusively randomized, but they can also assign themselves to an application in order to substitute someone or as an additional evaluator, giving them the possibility to skew the result.
I don't see why someone couldn't unlearn their skills in a few years though. A few months isn't even an argument, because a reapp is only needed after a few months if they left on standard terms (which usually means they fucked up as a BN, in which case it's not that unlikely they fail??). Previous BN experience does play a role; most returning members prior to the instant rejoin button had massive leniency given to them. It also works the other way: former members that had a shaky tenure would have that held against them as well.
Serizawa Haruki wrote:
1.8) Generally the aforementioned inconsistencies in evaluations are demonstrated by occurrences like former BNs or even former NAT members failing applications (or for example, Elite Nominators/former NAT members being kicked/probationed) which understandably raise some questions. It seems unlikely that competent modders forget or unlearn their skills in a few months or 1-2 years, so either problems have been overlooked previously and they were seen as better than they actually are, or the assessments don't do a good job at determining someone's capabilities. Previous BN experience should play a bigger role when assessing a candidate.
Let's be real, who on osu! is specifically trained in assessing behaviour? Both NAT and GMT, as far as I know, have access to the same set of rulebooks and guidelines, and most big behavioural issues go through GMT as well anyway. Unless you mean you want a higher-up position to verify everything, which is kind of ridiculous.
Serizawa Haruki wrote:
1.9) I find it questionable that the behavior of future and existing BNs is assessed by the NAT because they are not specifically educated/trained on how to do this and might not always be able to make fair calls about what's right or wrong. As some members have had incidents of misconduct themselves, they might not be the best candidates to judge how others act, and there have been examples of debatable decisions taken in this regard.
a) Evaluations can be public, just not from the NAT side (this has always been the case, even before the new update). The applicant is free to show their evaluation to anyone they want. Many don't though. The NAT have no issues making evaluations public, but there are still BNs and applicants that would rather have the anonymity.
Serizawa Haruki wrote:
1.10) All of this ties into the fact that there are little to no checks or consequences for subpar or unfair evaluations. This is of course a result of the NAT's self-regulation, but I think in part it also has to do with the fact that applications and their results are not visible to the public, so they are not subject to community opinions like qualified maps are for example, which can be a form of quality control. Apparently it is now possible to allow applications to be viewed publicly, but I'm not sure where they can be viewed by other users (was this explained anywhere?). The inability for decisions to be appealed can elicit feelings of powerlessness in applicants as well.
I am cautiously optimistic about the new system, but speaking from past experience, most people fail because they just lack the prerequisite knowledge to even begin modding anyway. Like, you can't identify issues and give good solutions if you barely know mapping stuff. How is that going to fit in a feedback? If I were to be harsh, the correct play is to "quit modding, learn how to map, then come back", but who wants to hear that?
Serizawa Haruki wrote:
1.11) Whether the upcoming changes to how feedback is delivered are beneficial remains to be seen. Either way, the problematic aspect is not necessarily the feedback's format, but more importantly its content. Issues are often explained poorly or insufficiently, making it hard to understand for the person reading it. The provided reasoning is sometimes overly subjective and not supported by facts or evidence, as well as generally lacking helpful information on how to improve. The different and potentially contradicting answers from evaluators when asking further questions only add to the confusion, but this should hopefully be mitigated by the new unified communication method.
Consider ^ what I mentioned above; I just think most people are trying for BN before they're ready. There are just more people in standard that are overly eager to apply.

BTW, when mock evaluations were a thing back then, the randomly rolled BNs' opinions generally aligned with the NAT. During BN evaluator cycles, the BNs were generally even more strict than the NAT. So if you're wondering, it's a gamemode thing, not an individual thing. Discrepancy in pass rates across individuals can probably mostly be explained by just RNG.

Also, like I mentioned before, the votes are NOT FINAL. Someone can pass a BN app with 3 fails if group discussion goes positively. NATs can also sometimes overcompensate; when they see an applicant they think the other 2 will pass, they play devil's advocate and try to point out the negatives to make the group discussion phase more valuable. Similarly, if an NAT sees an applicant they think will be failed by the other evaluators, they can sometimes take a more positive outlook, likewise to make the group phase more productive and less prone to narrow perspectives. It happens sometimes and skews the vote a little. But again, the vote isn't final anyway.
Serizawa Haruki wrote:
Next, I want to present and talk about some stats on the pass rate of BN applications. The data was taken on February 15th 2024 and is based on all-time evaluations from all current NAT members. I can share the complete spreadsheet if someone is interested.
2.1) The first thing that stands out is the large discrepancy between the different game modes:
osu! (standard): out of 526 total evaluations, 169 passed = 32.13% pass rate
osu!taiko: out of 164 total evaluations, 68 passed = 41.46% pass rate
osu!catch: out of 190 total evaluations, 135 passed = 71.05% pass rate
osu!mania: out of 272 total evaluations, 155 passed = 56.99% pass rate
A possible reason could be the size difference between modes, for example between standard and catch, but the number of successful applications being less than half in the former is still a huge gap. And considering the significant growth of mania in recent times (it nearly reached the same number of BNs as standard, even surpassing it briefly), the percentages seen above still differ notably, so size is likely not the only factor (if a factor at all). Taiko is also on the lower side here; I'm not sure if it's related to the fact that there are several newer NAT members in this mode, but it stuck out to me and also explains why there are not that many evals in total. So the question is: is the skill level of modders so different across game modes, is the learning curve higher or lower depending on the mode, or does each mode simply approach evaluations differently (stricter or more lenient)?
2.2) The other interesting aspect I noticed is how much the pass rates vary between individual members of each mode. The most notable one is osu! standard, where the highest rate is 45.83% and the lowest only 18.00%, and these are not outliers either, as there are other similar values for other people. The only other mode where the numbers differ significantly across evaluators is taiko (24.32%-57.14%), however both the highest and the lowest one are outliers. Both mania (48.28%-61.76%) and especially catch (70.27%-75.00%) are closer together, and these (coincidentally or not) are exactly the modes with the highest pass rates overall.
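As a side note, the per-mode percentages above can be recomputed from the raw counts with a few lines of Python. This is just a sanity-check sketch; the (passed, total) figures are taken directly from the stats quoted in this post, nothing else is assumed.

```python
# Recompute the BN application pass rates from the raw counts given above.
# (passed evaluations, total evaluations) per game mode, as stated in the post.
counts = {
    "osu! (standard)": (169, 526),
    "osu!taiko": (68, 164),
    "osu!catch": (135, 190),
    "osu!mania": (155, 272),
}

for mode, (passed, total) in counts.items():
    rate = 100 * passed / total  # pass rate as a percentage
    print(f"{mode}: {passed}/{total} = {rate:.2f}% pass rate")
```

Running this reproduces the quoted figures (32.13%, 41.46%, 71.05% and 56.99%), so the percentages in the post are internally consistent with the counts.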
That's exactly the problem though: with the current system you have to look for maps with specific issues, because not all maps are suitable, so just picking any map doesn't work. A lot of times you start modding a map and at some point realize it doesn't fulfill the role of a "BN app mod", so you have to find another one, increasing the effort of getting 3 appropriate maps together.
achyoo wrote:
I'll try to give my thoughts on the points I actually have thoughts on
1.1) I believe that the difficulty of finding maps to use for application is overblown. The advice I always give modding mentees is to simply go to a BN that has their request log public, and mod maps from people that requested said BN. There's more maps suitable and available out there. If applicants still struggle to find maps, it's because they are going into maps looking for specific issues rather than modding a map and finding issues in the map.
I really don't believe this is the case. For example, most BN app feedback continuously emphasized the importance of overarching mods, especially about contrast, emphasis and song representation, regardless of whether those issues were actually present in the submitted maps. The notion that smaller/less impactful suggestions are bad has also been pushed, as another example. Whether people pass or not is irrelevant here, as the reason they fail is not specifically this formulaic modding; the point is that certain types of modding are arbitrarily encouraged while others are discouraged, instead of being open to different ones.
achyoo wrote:
1.2) Most applicants do the formulaic modding and most of them fail so I don't see why this is an issue. Anyway the new system already solves this by shifting focus from the mods themselves to the overall decision making and ability to judge maps so.
1.3) How is that a BN app problem, that's a mentorship problem. Modding mentors are teaching people to mod that way for BN app so everyone does it. It's funny because most people that do this don't pass so I don't know why they keep doing it.
It's a little better perhaps, but I really don't see this as a solution to the problem. There are still quite a few things that are unclear and were not really explained, for example I mentioned some of them here.
achyoo wrote:
1.4) Hidden expectations is a fair argument, but I believe the new system solves it pretty adequately. What you need to do is clearly outlined, but I expect that people will take time to adapt and change their BN app methods so we can look back on this in a few months to see how applicants are doing.
The term "intersubjective" is often thrown around to justify precisely these kinds of things, but in reality there are no intersubjective standards shared by the whole mapping community or even a majority of it, since it's so divided on certain opinions. NAT members might have similar views on map quality and use that as a standard, but it doesn't necessarily align with the views of BNs, mappers etc. And even then, they are not enforced consistently, in part because evaluators' opinions differ from each other, as well as because of bias towards/against certain BNs, mappers or specific maps. The newly added public evaluation archives contain plenty of evidence that personal preferences are very much part of evaluations. I'm not saying it's the same as the 2015 QAT era, but it's definitely going in a similar direction, in the sense that instead of controlling which maps get ranked based on individual beliefs, what is being controlled is who gets BN based on those beliefs, ultimately affecting which maps get ranked. Besides, using past DQ discussions and veto mediation outcomes as precedents seems strange considering how much the NAT has insisted that previously ranked maps shouldn't be used as precedents and that each map should be judged in a vacuum.
achyoo wrote:
1.5) From my personal experience, evaluators' personal opinions and preferences don't have as much of an impact as you think. The evaluators will bring them up in group discussion, but very rarely are they what makes the difference in the final evaluation outcome. Most of the biggest "mistakes" are judged based on what is intersubjective, using past DQ discussions and veto mediation outcomes as precedents on what needs to be enforced and what doesn't. At least this has been my experience of being an evaluator for 8 months.
Again, the new application system is only different in theory so far. Whether the actual evaluation process is different in practice can't be said for sure yet, as we don't know what it actually looks like and how it's executed.achyoo wrote:
1.6) Yea I agree, but I would assume that evaluators are aware of this and do try to look at the big picture rather than focusing on mistakes. The new system should help because the applicant should be judged holistically and not based on their modding mistakes anymore (due to the new decision making judgement portion). BTW, 3 fails can still result in a pass: the vote is not final, and whatever consensus comes out of group discussion is what's final rather than the vote itself. I'm not saying it happens regularly, but I'm saying it is a possibility.
From my experience a lot of evaluations go overdue so that doesn't seem like a rare occurrence. I don't know whether it's always rerolled or sometimes handpicked, but it would be good to have more transparency about this as well.achyoo wrote:
1.7) Substitutions are generally never done on a whim; they're only done when a) the evaluation goes overdue OR b) one of the evaluators specifically said they can't do a certain one, in which case it's usually rerolled, not handpicked. Disclaimer that this is based on my tenure and I cannot say with 100% certainty that it works this way now, but I think it's fair to assume they still do it this way.
Until recently, BNs could only instantly rejoin up to 6 months after resigning which is why I said "months", but even with the current 1 year time period, people don't just unlearn their skills like that. You might be a little rusty after not modding for a while, but after doing a few mods (which you have to do anyway in order to apply) you get back into it. Besides, not everyone that resigns from BN stops modding completely, many continue to mod every now and then. Yet there are still cases of failed applications for these former BNs. I'm not sure the "massive leniency" is actually a thing as you claim, or at least it's not applied to everyone equally.achyoo wrote:
1.8) I don't see why someone couldn't unlearn their skills in a few years though. A few months isn't even an argument because a reapp is only needed after a few months if they left on standard terms (which usually means they fucked up as a BN, in which case it's not that unlikely they fail??) Previous BN experience does play a role; most returning members prior to the instant rejoin button had massive leniency given to them. It also works the other way: former members that had a shaky tenure would have that held against them as well.
Yes, nobody is trained on assessing behaviour and it shows (including GMT). Having access to rulebooks and guidelines doesn't mean they are actually used and enforced properly. Of course an even higher position wouldn't solve the issue, but there should be more rigorous tests and checks in place to make sure NAT members are competent in this regard and held accountable for their actions. Given that BNs are not actually part of staff, I don't think it makes sense to police their attitude so strictly. Another contradiction is that BNs who take part in evaluations are also supposed to evaluate behavior while not being in a position to do so. So either BNs should be part of staff or not be expected to act how the NAT wants them to.achyoo wrote:
1.9) Let's be real, who on osu is specifically trained on assessing behaviour? Both NAT and GMT, as far as I know, have access to the same set of rulebooks and guidelines, and most big behavioural issues go through GMT as well anyway. Unless you mean you want a higher up position to verify everything, which is kind of ridiculous.
Yes, people could always post their evaluations publicly, but most of the time there was just no reason to do so. The problem is that even if they did, nothing would happen. There are several cases of public outrage following certain BN's removal for example, but ultimately even if most people agreed it wasn't a legitimate decision there's nothing you could do about it.achyoo wrote:
1.10)
a) Evaluations can be public, just not from NAT side (always been the case even before the new update). The applicant is free to show their evaluation to anyone they want. Many don't though. NAT have no issues making evaluations public, but there are still BNs and applicants that would rather have the anonymity.
b) You can appeal, people just don't.
Not everyone is at that level where they lack basic mapping and modding knowledge, I'd say most people have some skills but need to improve quite a bit, so the feedback you're describing only applies to some cases. Others would definitely benefit from feedback that is explained better.achyoo wrote:
1.11) I am cautiously optimistic about the new system, but speaking from past experience, most people fail because they just lack the prerequisite knowledge to even begin modding anyway. Like, you can't identify issues and give good solutions if you barely know mapping stuff. How is that going to fit into feedback? If I were to be harsh, the correct play is "quit modding, learn how to map, then come back", but who wants to hear that?
Also you said "The provided reasoning is sometimes overly subjective and not supported by facts or evidence, as well as generally lacking helpful information on how to improve."
I find it hilariously ironic because you touched on this point, but your provided solutions do nothing to improve this, nor did you support your own point with evidence either.
Sure, a lot of people apply before they're ready. But if it's a gamemode thing as you said, why is this not the case as much in other modes? Do modders for those modes have access to more useful resources and guides? Is the community closer together and helping each other improve? Or are BN apps just evaluated differently or less strictly?achyoo wrote:
2.1 & 2.2) Consider ^ what I mentioned above; I just think most people are trying for BN before they're ready. There's just more people in standard that are overly eager to apply. BTW, when mock evaluations were a thing back then, the randomly rolled BNs' opinions generally aligned with the NAT's. During BN evaluator cycles, the BNs were generally even more strict than the NAT. So if you're wondering, it's a gamemode thing, not an individual thing. Discrepancy in pass rates across individuals can probably mostly be explained by just RNG. Also, like I mentioned before, the votes are NOT FINAL. Someone can pass a BN app with 3 fails if group discussion goes positively. NATs can also sometimes overcompensate; when they see an applicant they think the other 2 will pass, they play devil's advocate and try to point out the negatives to make the group discussion phase more valuable. Similarly, if an NAT sees an applicant they think will be failed by the other evaluators, they can sometimes take a more positive outlook to make the group phase more productive and less prone to narrow perspectives. It happens sometimes, skews the vote a little. But again, the vote isn't final anyway.
The maps would obviously still be looked at, otherwise important context is missing. I do think the mods were being analyzed in quite some detail until now, in the sense that every single suggestion was looked at and sometimes minor things were pointed out in the BN app feedback. By "recent modding history is looked at in general" I don't mean that the evaluators should look at all the mods from the past 6 months, but simply look at some of them. So if someone is looking at 3 maps, it's not more workload than before. If anything, it would be less work due to the less thorough check.achyoo wrote:
3.1 & 3.2) "recent modding history is looked at in general"
yea no. Either this means they are being evaluated purely on wording and not looking at the maps (which fucking sucks), or the evaluators have to download the maps and look at them, which even when not analyzed in a super detailed way (btw no one does that for evaluations, it's simply not feasible and takes too much time), is more workload than the current 3 map system.
The new system already tries to mitigate the "look at modder, not mods" issue by trying to understand the modder's thought process so we can see how it works.
I agree with the practical application thing, kinda sad trial BN is gone tbh.
If everyone is desperate for it, why is nobody doing anything then, especially the people who are in a position to make systemic changes? You say there is no good, feasible and effective solution, so what are the problems? I'm aware that my suggestions are probably nothing new or groundbreaking, but I still don't think it's impossible to implement anything like that, it's just that nobody has been putting in the effort to make it work, therefore I want to respark a discussion regarding this.achyoo wrote:
4.2) Qualified QA and the like have been discussed for almost half a decade at this point; nothing you brought up is new. If a good, feasible, and effective solution had been brought up in the past 5 years it would have already been implemented. Everyone is desperate for a working QA system, not just you lol
Well, that's precisely why a system focused more on observation of real-world experience would be better, as I've laid out in my suggestions.clayton wrote:
1.2 - 1.3 problems similar to this are found in every type of testing or interviewing culture, it sucks in some ways but I think it's difficult to legitimately evaluate someone's abilities without either a contrived setup or observation of real-world experience
What you said about top-down management doesn't really address any of the issues I pointed out. Sure, this type of management can work, but only under certain circumstances which are just not given right now.clayton wrote:
1.4 - 1.11 (okay I admit I skimmed it) I've never really been sympathetic toward complaints that these things are biased because I think a lot of personal weight from evaluators is part of a working system. how I envision the best form of this system anyway is that the NAT are experienced people who you can trust to moderate the mapping ecosystem & promote new people into its management, and it would only be a detriment if they are prevented from applying their judgement. obviously in practice not everyone is perfect and people want different things. but this style of top-down management actually works fine for me in terms of QC for a small team (despite my very vocal opposition to multiple parts of it, over the years...). I have ideas for other types of management that I feel could be worthwhile but they're less like "NAT should have a different process for xyz" and more like "start from nothing and refactor everything" lol so I will not say that here
The significance is that it indicates how much evaluations differ from one another, which matters because the standards should be fair and not affected by randomness as much as possible.clayton wrote:
2 these are individual people applying their own standards across multiple not-very-related gamemodes so tbh I don't see the significance of basically any of this.
Both could also just be combined into a single probation phase somehow, I just put it as an additional trial period in order to prevent the idea from being shut down immediately due to an increased risk of unprepared modders becoming BNs.clayton wrote:
3.2 I wouldn't mind if this system was just what it meant to be a probationary BN. but having both this and probation seems somewhat redundant, they are obviously different but serve a similar purpose (hold back their nominating ability while they begin to perform BN duties)
I'm aware of the new communication method for questions about the feedback but that has nothing to do with it as it doesn't contain any information about appealing. And even if that text was removed from BN apps so long ago, why was nobody informed about this change? Again, can you explain how exactly the appeal process works? I've never heard or seen anyone do this in the past few years.achyoo wrote:
Yea that's why it was removed, to allow appeals (I was literally involved in the removal of that line lol). It must have been almost 2 years already and people have already tried to appeal.Serizawa Haruki wrote:
> You must be misinformed about appeals because for the longest time this was written at the bottom of every application: "The consensus of your evaluation is final and appeals will not be taken." It was recently removed from applications (even old ones) but without announcing or communicating this anywhere, so even if it's possible to do so now, people are obviously not aware of it. How does the process even work exactly if it's currently in place?
Also, it's not like you're kept in the dark. There's a literal box on the eval page to send in any queries now. Idk what else needs to be done tbh, this is as low of a barrier of entry as it gets.
That wasn't the point, it was only an explanation as to why I was talking about examples where people reapplied after less than a year. Either way I don't get why you chose to respond to half a sentence that is not relevant to the point I'm making and ignored everything else. As I've said before, even with the 1 year grace period it doesn't change the issue at hand of former BNs failing applications, sometimes despite continuing modding during their break.achyoo wrote:
Similar to above, it must have been more than a year already (edit: been told the duration increase was implemented 4 months ago, previous implementation was 6 months like you mentioned). Evidently the issue was recognized and fixed. I don't know why you keep bringing up the past when the issues you mentioned were already rectified.Serizawa Haruki wrote:
> Until recently, BNs could only instantly rejoin up to 6 months after resigning which is why I said "months",
That was obviously not my intention, it's just not possible for me to know about something that is being done behind the scenes. If there are people working on these things, that's great to hear, but it might be better to involve the community at large or at least inform them on what is going on. For example, the discussion in this thread was never followed up by staff.achyoo wrote:
you've probably just pissed off the people who have been trying so hard to make it work over the years behind the scenes lmao.Serizawa Haruki wrote:
> it's just that nobody has been putting in the effort to make it work,
One has nothing to do with the other though. What I meant by this is not that modders were expected to point out overarching problems when there were none, but that they were expected to submit maps that had these specific issues. If none of the maps that were modded had those kinds of general problems, it would impact the outcome negatively.achyoo wrote:
exaggerating issues is also a very common reject reason so idk what you are talking aboutSerizawa Haruki wrote:
> For example, most BN app feedback continuously emphasized the importance of overarching mods, especially about contrast, emphasis and song representation, no matter if those issues were actually present in the submitted maps or not.
As stated at the very beginning of my post, these are absolutely not only my own experiences, but those of many users who participated in BN apps. The reason why it might not align with your experiences could be that you are viewing them from the position of an evaluator and not an applicant. I know that every evaluator used to be an applicant at some point, but for many this experience lies in the past and is easily overshadowed by what they saw and did in their role on the opposite end. Someone who has learned the required skills to succeed in becoming BN and who becomes familiar with the intricacies of the system might not always be able to perceive things with the same perspective as someone who hasn't.achyoo wrote:
everything else you wrote is pretty heavily based on your assumptions and experiences, and it does not feel in line with my experiences, so either you're extrapolating information wrongly from what you have or you have access to information that I don't. (do you?)
Yes, lowering the barrier of entry for BN (to a reasonable degree without impacting the quality of the ranked section) is one of the goals, but certainly not the only one. Your statement "You may as well petition to delete the usergroup" is incredibly reductive and does not contribute to a meaningful discussion at all. There is nothing wrong with disagreeing or questioning certain things about the current system. None of my critiques are meant to be personal attacks or anything, it's simply an attempt to improve the situation. So I'd appreciate if you could actually address my points with proper counterarguments instead of trying to dismiss valid concerns merely due to disagreement.achyoo wrote:
all your suggestions essentially boil down to "lower the barrier of entry for BN", which is fair in the cases that are previously being judged too harshly (especially concerning things like wording, which I believe is an unfair barrier to non-native speakers of English). But with everything else, you are basically saying you do not agree with and do not have faith in the ability and judgement of the NAT to correctly pick out who is and isn't suitable to be BN. In that case reworking the application system does nothing since the same people are in charge. You may as well petition to delete the usergroup.
Although I understand the sentiment, it might be quite frustrating/unfair for people to spend time modding a map, only for it to be disqualified from BN app usage because a BN nominated it before they wrote the application.Nao Tomori wrote:
This is a fair concern. However, the point of this is to attempt to determine the applicant's quality standards and ability to evaluate maps in a vacuum. That objective is significantly impaired if the map has already been nominated. The requirement would only extend to maps not bubbled at the time of submission, not throughout the life of the application. As such, I view this as a necessary tradeoff to better accomplish the goals of the system.