
[Proposal - Beatmap] Limit file and difficulty names to printable ASCII characters

Topic Starter
Ryu Sei
We often see problems in the help forums from users who seemingly can't process downloaded beatmaps correctly (community/forums/topics/1772102 , community/forums/topics/1765894 , community/forums/topics/1705902 , community/forums/topics/1692408 ), and most of these problems come down to the uploader using non-ASCII characters in file names. The offending names might be those of the beatmap's audio, background, or difficulties. Because there is no rule against using non-ASCII characters in file names, one can technically rank such a beatmap, and it can cause problems in the future. As simple as it sounds, I propose this:
  1. Remove this guideline, because it might cause issues in future uploads.
    Avoid non-alphanumeric unicode characters in a difficulty's name. These can cause errors with the beatmap submission system and problems for certain users when appearing in chat.
  2. Add a new rule regarding file names and difficulty names.
    All file names and difficulty names must use printable ASCII characters. Using other characters can cause errors with the beatmap submission system and importing process.
    This should cover every character restriction we need. Even if we don't mention reserved characters, it should not be possible to use them in file systems in the first place. For difficulty names, reserved characters are automatically trimmed by osu!, so we can ignore them.
Reference:
  1. Printable ASCII characters (characters 32-126, hex 20-7E)
    https://en.wikipedia.org/wiki/ASCII#Printable_characters
      ! " # $ % & ' ( ) * + , - . / 
    0 1 2 3 4 5 6 7 8 9 : ; < = > ? 
    @ A B C D E F G H I J K L M N O 
    P Q R S T U V W X Y Z [ \ ] ^ _ 
    ` a b c d e f g h i j k l m n o 
    p q r s t u v w x y z { | } ~
  2. Reserved characters ("illegal" characters in Windows systems)
    \ / : * ? " < > |
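For illustration, the two reference lists above can be turned into a small check. This is only a minimal sketch in Python: the function names are hypothetical, and it is not how osu! or Mapset Verifier actually validates names.

```python
# Windows reserved characters, copied from the reference list above.
WINDOWS_RESERVED = set('\\/:*?"<>|')


def non_printable_ascii(name: str) -> list[str]:
    """Return the characters in `name` outside printable ASCII (32-126)."""
    return [ch for ch in name if not 32 <= ord(ch) <= 126]


def reserved_chars(name: str) -> list[str]:
    """Return the Windows reserved characters found in `name`."""
    return [ch for ch in name if ch in WINDOWS_RESERVED]


def is_allowed_file_name(name: str) -> bool:
    """True if `name` uses only printable ASCII and no reserved characters."""
    return not non_printable_ascii(name) and not reserved_chars(name)
```

A file name such as `audio.mp3` passes, while a name containing kana, kanji, or accented letters is reported character by character, which makes it easy to point a mapper at exactly what needs renaming.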
History
Add a new guideline regarding difficulty names.
Difficulty names containing square brackets ( [ ] ) should...
  1. ...contain at most one of each symbol when unpaired. Having multiple unpaired square brackets in succession may cause unintended behavior in chat.
  2. ...have them paired when both square brackets are used. This also applies to nested square brackets ( [[ ]] ).
Combining square brackets in a certain order causes the chat to bug out (as seen on my beatmap, beatmapsets/1860586 ). This should be a guideline rather than a rule because there are legitimate uses of square brackets in difficulty names, such as stylizing the name.
Additionally, community/forums/posts/9181474 explains in detail how square brackets affect IRC chat.

This guideline proposal is no longer necessary, as it would limit creative difficulty names.
Protastic101
+1, have come across some issues with accessibility in the past due to non-ASCII characters in sample names and difficulty names. Was always weird trying to justify to the mapper why they needed to change it when it wasn't explicitly disallowed in RC.
Drum-Hitnormal
peppy should fix the osu client to prevent upload when its gonna break something later, much easier than wasting BN time for checking meta

or just make it as part of Mapset Verifier
Topic Starter
Ryu Sei
I'm not sure, maybe it's a feature all along instead of an issue?

Mapset Verifier hasn't been updated in a long time. It hasn't caught up with the latest RC changes yet, so unless someone forks a better MV, we're stuck.

Just like with other proposals, if peppy won't fix it, we just enforce it through the RC. We can revert the rule later if peppy implements this as a feature.
lewski

Drum-Hitnormal wrote:

peppy should fix the osu client to prevent upload when its gonna break something later, much easier than wasting BN time for checking meta

or just make it as part of Mapset Verifier
I agree with this in principle but expecting peppy to do any of that is unrealistic to say the least


as for the actual proposal, I'm fine with the first change, although I see no reason to remove the part about chat problems in the justification as nothing you've said suggests that it's no longer a part of the issue:

All file names and difficulty names must use printable ASCII characters. Using other characters can cause errors with the beatmap submission system and importing process as well as problems for certain users when appearing in chat.

I also think the part about square brackets should be refined to more accurately tackle the actual issue with them. Based on searching "difficulty=[" and "difficulty=]", the vast majority of diffnames that use square brackets only have a single pair, usually around a word. As far as I can tell, this causes no issues with /np, so this usage would likely be an acceptable reason to break the proposed guideline. It would be extremely counterintuitive to have a guideline that the vast majority of potentially violating maps actually just bypass, so ideally, the guideline should be more specific.

Chat links seem to follow the format [<link> <link text>], with [https://example.com funny link] resulting in a link displaying the text "funny link" and leading to https://example.com. Based on my testing, this seems to break whenever the link text contains any unpaired square brackets, but not with any configuration of correctly paired brackets. This is what any rule or guideline regarding square brackets should target.

Additionally, when /np is used while playing or editing a map, the diffname is enclosed in square brackets. This doesn't break any correctly paired brackets in the diffname, but it creates a special case that wouldn't work without the enclosing brackets: having both one unpaired ] and one unpaired [ in the diffname doesn't break the link because they pair with the brackets around the diffname. Right now, allowing this wouldn't really break anything, but it's such a niche case that adding an explicit allowance for it seems a bit silly, plus it could theoretically break in the future.
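The pairing behaviour described above can be sketched as a simple left-to-right scan. This is a rough model in Python with a hypothetical function name; it only counts unpaired brackets and is not the actual chat link parser.

```python
def unpaired_brackets(text: str) -> tuple[int, int]:
    """Return (unpaired_closing, unpaired_opening) square bracket counts."""
    open_depth = 0        # opening brackets still waiting for a match
    unpaired_closing = 0  # closing brackets seen with nothing open
    for ch in text:
        if ch == "[":
            open_depth += 1
        elif ch == "]":
            if open_depth > 0:
                open_depth -= 1
            else:
                unpaired_closing += 1
    return unpaired_closing, open_depth
```

Under this model, a diffname like `a] b [c` has one unpaired bracket of each kind on its own, but becomes fully paired once it is wrapped as `[a] b [c]`, which matches the special case described for /np.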

Furthermore, enclosing a piece of text in [[double square brackets]] creates a link leading to a page with that title on the wiki. Normal links take priority over this behaviour, so using multiple nested pairs of brackets in a diffname doesn't cause any issues with /np. However, when such a diffname is mentioned in chat outside of a link, it does result in an odd-looking link to a likely non-existent wiki page.
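A quick way to spot text that would turn into a wiki link is a regular expression over the double-bracket pattern. This is a hedged sketch in Python with a hypothetical function name; the real chat client may tokenize messages differently.

```python
import re

# Non-greedy match of anything enclosed in [[double square brackets]].
WIKI_LINK = re.compile(r"\[\[(.+?)\]\]")


def wiki_link_titles(message: str) -> list[str]:
    """Return the titles that would be read as wiki links in `message`."""
    return WIKI_LINK.findall(message)
```

A diffname with a single pair of brackets yields no matches, while the same diffname wrapped in one more pair of brackets does.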

I'm not sure how important it is to try to tackle the last point, because although having a wiki link in the middle of a diffname does look weird, I think it's safe to say that normal users are extremely unlikely to fully type out such a diffname. Bots such as BanchoBot or Tillerino could reasonably do so, though, and Tillerino specifically also encounters the issue even with diffnames that are fully enclosed in just a single pair of brackets since it adds another pair of brackets to any diffname. Would appreciate thoughts on the matter.

With the above in mind, if we do need to control how square brackets are used in diffnames (which I'm honestly still pretty ambivalent about), the rule or guideline should specifically be about unpaired square brackets, potentially with a note about nested brackets if deemed necessary.
Topic Starter
Ryu Sei
Good point, I will update the proposed guidelines regarding square brackets.

Difficulty names containing square brackets ( [ ] ) should...
  1. ...contain at most one of each symbol when unpaired. Having multiple unpaired square brackets in succession may cause unintended behavior in chat.
  2. ...have them paired when both square brackets are used. This also applies to nested square brackets ( [[ ]] ).

If it's too much of a burden to make guidelines regarding square brackets, I am more than happy to drop that part of the proposal.
Topic Starter
Ryu Sei
I have come to the conclusion that adding guidelines regarding brackets is not necessary. Therefore, the proposal will be modified to contain only the main idea of this topic: limiting the characters allowed in file names and difficulty names.

EDIT: I tracked down peppy's post on GitHub ( https://github.com/ppy/osu-stable-issues/issues/944#issuecomment-1046474632 ), which said that it will require an infrastructure change. So, unless lazer already supports exporting maps with non-ASCII characters, we should escalate this to a rule instead.
clayton
the fact that this causes such breaking issues with stable makes it highly unlikely to be ranked even without writing anything in RC -- i dont think anything needs to change

and writing this in RC won't stop unranked maps from containing similar issues as referenced in the help forum posts.
Okoayu
agree with clayton, but would probably be good to review the current guideline this proposal wants to change in rc cleanup