
[assigned] [Rewrite] Metadata Section

Topic Starter
Okoratu

READ THE DRAFT HERE



Hi!

In our effort to update the RC to be more relevant and easier to comprehend, we rearranged this section quite a bit.
  1. The draft is ordered by the target field or category that each rule, guideline, or allowance applies to
  2. We took the existing criteria as a base and added explanations to edge cases where appropriate (everywhere) (real)
As the draft contains formatting that cannot be replicated in BBCode, please read the draft on GitHub. We focused on providing more examples to explain the more esoteric cases.
Most of the examples came out hilarious while illustrating the point clearly, so we hope to keep them around if possible.

Please leave comments, suggestions and questions over here!
This draft has, to my knowledge, no dependencies on other sections as far as its formatting and content are concerned.

Nice-to-have Stuff we don't have yet

We want to include a guide & flowchart for the primary sources article, or alternatively a guide on which hoops to jump through to find metadata. If you want to help with that feel free to reach out!

Feedback

Please use this forum thread to drop feedback on the draft! Alternatively, for smaller questions and such join the RC discord instead!
The first round of revision starts either when feedback on this thread dies down or two weeks from now, on 2024-03-23 (yyyy-mm-dd).

Thanks for reading!

ToDo list currently empty
Noffy
thank you oko \o/
Drum-Hitnormal
very nice formatting making it clear to understand

from what i see on new RC, it seems to say Inori Minase is allowed, Minase Inori is not allowed because theres official romanization for her


is this correct?

im against this idea cuz it makes names look weird when there's multiple artists and the name order is different despite them all being japanese

also artist may get official romanisation after maps are ranked making it inconsistent with past ranked
Noffy

Drum-Hitnormal wrote:

very nice formatting making it clear to understand

from what i see on new RC, it seems to say Inori Minase is allowed, Minase Inori is not allowed because theres official romanization for her


is this correct?

im against this idea cuz it makes names look weird when there's multiple artists and the name order is different despite them all being japanese

also artist may get official romanisation after maps are ranked making it inconsistent with past ranked
It's a guideline based off what's commonly seen and recognized, so if it's debatable for any reason that should ideally be decided on per case. Not a hard set rule right there
aceticke

Drum-Hitnormal wrote:

very nice formatting making it clear to understand

from what i see on new RC, it seems to say Inori Minase is allowed, Minase Inori is not allowed because theres official romanization for her


is this correct?

im against this idea cuz it makes names look weird when there's multiple artists and the name order is different despite them all being japanese

also artist may get official romanisation after maps are ranked making it inconsistent with past ranked
“Aim to match Ranked or Loved beatmaps. Follow what is most recent and common, then verify that metadata is correct and fix as needed.”

Because of this guideline, you wouldn't be required to use Inori Minase. Furthermore, she regularly uses both orders, which function as two official romanisations, so there is no requirement to use one over the other.
momoyo

Drum-Hitnormal wrote:

from what i see on new RC, it seems to say Inori Minase is allowed, Minase Inori is not allowed because theres official romanization for her
Hmm I think it can work both ways, we had a talk in the BN server about it a few weeks prior here.

Which makes me disagree with the Artist name order rework since it will cause situations like what Doormat said, I think there should be some sort of leniency with it in some cases like (CV: Minase Inori) as an example. I personally think the current RC about this is fine.

Edit:

Noffy wrote:

It's a guideline based off what's commonly seen and recognized, so if it's debatable for any reason that should ideally be decided on per case. Not a hard set rule right there
It's clearly stated as a Rule in the draft though, so it should either be restated as a guideline or be taken as something to follow 100% of the time, no?
Noffy
Oh I see it now, I'll put that on the to-fix list because we covered it as an optional but extremely encouraged item like 3 other times O.o

E: gist should now be fixed to remove the contradictory statement, /o/
Topic Starter
Okoratu

Noffy wrote:

Oh I see it now, I'll put that on the to-fix list because we covered it as an optional but extremely encouraged item like 3 other times O.o

E: gist should now be fixed to remove the contradictory statement, /o/
Yea, the draft had this



which is slightly different and more shitposty but fixed now, as it was clearly a communication error :D
SuzumeAyase


Uhh... what's the difference?
Topic Starter
Okoratu
they could be interpreted as errors or typos but clearly are stylistic choices
SuzumeAyase

Okoratu wrote:

they could be interpreted as errors or typos but clearly are stylistic choices
I see... okay understandable
momoyo
1) "Any form of feat., feat, ft., featuring, etc. that are indicating an artist featured in the song must be written as feat."

Having "feat." at the beginning looks like a typo not gonna lie
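For anyone who wants to check this mechanically, the quoted feat. rule could be sketched roughly like this. It is only an illustration: `normalise_markers` is a made-up helper name, not part of any osu! tool, and it only covers the spellings listed in the quoted rule plus the vs./versus spellings from the matching collaboration rule in the draft. The "etc." cases and intentional stylisation would still need a human.

```python
import re

# Spellings taken from the quoted draft rules; the draft's "etc." is NOT covered.
FEAT_PATTERN = re.compile(r"\b(?:featuring|feat|ft)\b\.?", re.IGNORECASE)
VS_PATTERN = re.compile(r"\b(?:versus|vs)\b\.?", re.IGNORECASE)


def normalise_markers(artist: str) -> str:
    """Rewrite feat./vs. variants in an artist string to 'feat.' and 'vs.'."""
    artist = FEAT_PATTERN.sub("feat.", artist)
    return VS_PATTERN.sub("vs.", artist)


print(normalise_markers("Artist A ft. Artist B"))
print(normalise_markers("Artist A versus Artist B"))
```

The word boundaries keep the patterns from firing inside longer words, so a name like "Daft Punk" or a title containing "feature" is left alone.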

2) I'm very happy to see the Marker part of the RC be clarified more but I think the wording could be a little bit better. Maybe something like this

Before: (Game Ver.) "Use this when there is an existing length marker such as ~Game Size~, (Game Size), game OP edit, OP Version."

After: (Game Ver.) "Songs from a video game with existing markers such as ~Game Size~, (Game Size), game OP edit, OP Version must be standardised to (Game Ver.)."

^ Same would apply to Movie Ver. perhaps, added "must" instead because it's something that should be followed always I believe

Also may I ask what's gonna happen with (Short Ver.) markers? So far we are still using them, including on some licensed FA songs that have that metadata, e.g. beatmapsets/2121497
Monoseul
This might be edited with more questions/suggestions as I go

Looks amazing so far, so much better than what we have right now.

THOUGHTS:

1.
For the symbols table in the "Handling Symbols" field, this is in the table right below "Recommended Romanisation":

- "When multiple options exist, the one used for romanisation depends on context."

It's minor but having this in the same box as "recommended romanisation" is a little stuffed. It's probably for emphasis, but as a reader we tend to read what the row means, and any extra info is somewhere else. It's a bit counterintuitive.
I feel like that info should be put somewhere below the table instead of the same box that defines the row it's in. More cohesive.

-
2a.

Under "Rules" in the Artist segment, there's:

- Commas must have a trailing space unless intentionally stylised otherwise.

Add a comma after "space" pls xD it's easy to gloss over text without a punctuation for a pause.

2b.
Also under the Artists segment, in "Marker Rules" there's:

- Any form of vs., versus, etc. indicating collaboration between artists must be written as vs..

Should the first "vs." be "vs"? This repeats twice (second time at the end.)

-
3.
Under the Guidelines for Artists:

- For doujin circles, you should use any of the following options:

Just a little confused with this one. The word choice implies it's up to the mapper's choice to pick one of these, which sounds like potential for an inconsistent metadata mess :')
If it's a case-by-case basis kind of thing, I'd replace "any" with "one". One word change but it helps avoid this I think.
Topic Starter
Okoratu

momoyo wrote:

1) "Any form of feat., feat, ft., featuring, etc. that are indicating an artist featured in the song must be written as feat."

Having "feat." at the beginning looks like a typo not gonna lie

2) I'm very happy to see the Marker part of the RC be clarified more but I think the wording could be a little bit better. Maybe something like this

Before: (Game Ver.) "Use this when there is an existing length marker such as ~Game Size~, (Game Size), game OP edit, OP Version."

After: (Game Ver.) "Songs from a video game with existing markers such as ~Game Size~, (Game Size), game OP edit, OP Version must be standardised to (Game Ver.)."

^ Same would apply to Movie Ver. perhaps, added "must" instead because it's something that should be followed always I believe

Also may I ask what's gonna happen with (Short Ver.) markers? So far we are still using them, including on some licensed FA songs that have that metadata, e.g. beatmapsets/2121497

Added a small extension to Game Ver
i think movie ver is fine as is

Short ver folds into #### Ver, but i think it's common enough to justify having

Monoseul:
1) will add some form of clarification
2a done
2b clarified and moved around
Topic Starter
Okoratu
re momoyo short ver: fixed
momoyo
Sounds good now :D I have nothing else to mention I think
Monoseul
I'm back again part 2


Monoseul wrote:

3.
Under the Guidelines for Artists:

- For doujin circles, you should use any of the following options:

Just a little confused with this one. The word choice implies it's up to the mapper's choice to pick one of these, which sounds like potential for an inconsistent metadata mess :')
If it's a case-by-case basis kind of thing, I'd replace "any" with "one". One word change but it helps avoid this I think.
Didn't get any response from this, could double check this one

4.
Under "Guidelines" in the Source section.

- If a track...

> was first released and later featured or tied to a piece of media, using the source field is optional.
> has been featured in multiple pieces of media, any option can be used as the source. Website names are only valid sources, if the song...
> and website are tied to specific cultural phenomena such as Newgrounds, etc.
> was composed as a website theme or background song.


I'm confused about the formatting between the second and third points. The 2nd point cuts off and 3rd seems to be a continuation, not sure if this is a mistake or it's intentional but it's pretty confusing atm when reading through the current format.

-
5.
Under "Cyrillic Writing" for Romanisation.

> Use the BGN/PCGN System.
> Е and е to ye if it stands alone or after a, e, ё, и, о, у, ы, э, ю, я, й, ъ, ь. In other cases, use e.
> ё to o if it comes after ж, ч, ш, or щ. In other cases, use yo.
> Ignore any other rules in the file provided, as these are either irrelevant or would not help in the game.
> For other characters, refer to the first page of this document


Third point is a little confusing. If it's referring to the document below it I think the 3rd and 4th point should be switched - reader can easily assume it was referring to the first point but that's just a wiki page of the system.
Also, pedantic but, I would change "refer to the first page of this document" to "refer to the first two pages of this document" since the character list spans two pages. Clarity :')
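To make the two quoted special cases easier to picture, here is a rough sketch of how they might be applied. It is not the full BGN/PCGN table: only е and ё are handled, and every other character is passed through unchanged. `romanise_e_yo` is a hypothetical helper, "stands alone" is read here as "has no letter directly before it", and the trigger list is assumed to mean the Cyrillic letters а and е.

```python
# Letters after which е becomes "ye" (Cyrillic letters, per the quoted draft rule).
YE_TRIGGERS = set("аеёиоуыэюяйъь")
# Letters after which ё becomes "o".
O_TRIGGERS = set("жчшщ")


def romanise_e_yo(text: str) -> str:
    """Apply only the е/ё special cases; all other characters are left as-is."""
    out = []
    prev = ""
    for ch in text:
        lower = ch.lower()
        if lower == "е":
            # Assumption: "stands alone" means no letter directly precedes it.
            latin = "ye" if (not prev.isalpha() or prev.lower() in YE_TRIGGERS) else "e"
        elif lower == "ё":
            latin = "o" if prev.lower() in O_TRIGGERS else "yo"
        else:
            latin = None
        if latin is None:
            out.append(ch)
        else:
            # Keep the capitalisation of the original letter.
            out.append(latin.capitalize() if ch.isupper() else latin)
        prev = ch
    return "".join(out)


print(romanise_e_yo("ёж"))
```

The remaining letters would still have to come from the character table in the referenced document, so the output of this sketch is deliberately a mix of Latin and Cyrillic.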

-
Other than that I don't have much else to say :-) anything about the content itself I don't really have a comment on. Hope this gets through at some point~
Topic Starter
Okoratu

Monoseul wrote:

I'm back again part 2


Monoseul wrote:

3.
Under the Guidelines for Artists:

- For doujin circles, you should use any of the following options:

Just a little confused with this one. The word choice implies it's up to the mapper's choice to pick one of these, which sounds like potential for an inconsistent metadata mess :')
If it's a case-by-case basis kind of thing, I'd replace "any" with "one". One word change but it helps avoid this I think.
Didn't get any response from this, could double check this one
Nah. The choice is up to the mapper, based on what's most recognizable and what the doujin circle commonly uses - it's how doujin music has been handled historically. People have been arguing for the formatting that looks "neatest" and we don't want to take that away

Monoseul wrote:

4.
Under "Guidelines" in the Source section.

- If a track...

> was first released and later featured or tied to a piece of media, using the source field is optional.
> has been featured in multiple pieces of media, any option can be used as the source. Website names are only valid sources, if the song...
> and website are tied to specific cultural phenomena such as Newgrounds, etc.
> was composed as a website theme or background song.


I'm confused about the formatting between the second and third points. The 2nd point cuts off and 3rd seems to be a continuation, not sure if this is a mistake or it's intentional but it's pretty confusing atm when reading through the current format.
it's not supposed to be like that, fixed...

Monoseul wrote:

-
5.
Under "Cyrillic Writing" for Romanisation.

> Use the BGN/PCGN System.
> Е and е to ye if it stands alone or after a, e, ё, и, о, у, ы, э, ю, я, й, ъ, ь. In other cases, use e.
> ё to o if it comes after ж, ч, ш, or щ. In other cases, use yo.
> Ignore any other rules in the file provided, as these are either irrelevant or would not help in the game.
> For other characters, refer to the first page of this document


Third point is a little confusing. If it's referring to the document below it I think the 3rd and 4th point should be switched - reader can easily assume it was referring to the first point but that's just a wiki page of the system.
Also, pedantic but, I would change "refer to the first page of this document" to "refer to the first two pages of this document" since the character list spans two pages. Clarity :')

-
Other than that I don't have much else to say :-) anything about the content itself I don't really have a comment on. Hope this gets through at some point~
oops yea switched these around
Serizawa Haruki
The structure is way better and easier to understand overall. I just have some minor corrections and suggestions:

General -> Rules:
- Metadata fields, except for tags, over 81 characters must be shortened.
I think it would sound more natural to change the order within the sentence like this: Metadata fields over 81 characters, except for tags, must be shortened.
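As a quick illustration of what this rule checks, a sketch along these lines would flag the offending fields. The field names in the dictionary are placeholders for the example, not the actual .osu keys, and `fields_too_long` is a made-up helper.

```python
MAX_FIELD_LENGTH = 81  # limit quoted from the draft rule


def fields_too_long(metadata: dict) -> list:
    """Return the names of non-tag fields whose value is over 81 characters."""
    return [
        name
        for name, value in metadata.items()
        if name != "tags" and len(value) > MAX_FIELD_LENGTH
    ]


example = {
    "title": "x" * 82,      # one character over the limit
    "artist": "y" * 81,     # exactly at the limit, so still fine
    "tags": "z " * 200,     # tags are exempt from the rule
}
print(fields_too_long(example))
```

"Over 81 characters" is read here as strictly more than 81, so a field of exactly 81 characters passes.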

General -> Guidelines -> When multiple metadata options are available:
Aim to match Ranked or Loved beatmaps. Follow what is most recent and common, then verify that metadata is correct and fix as needed.
The exception "This does not apply if the artist intentionally uses a different alias for different song or album releases." should probably be added here.

General -> Guidelines -> When multiple metadata options are available:
- Official romanisations/translations are preferred for romanised fields, so long as they are easily found and commonly recognised.
I tried bringing this up before but it was never addressed, so I'll say it again: official translations should not be preferred for romanised fields. Unofficial translations don't exist (or at least are not allowed), so the only thing official translations can be preferred over is unofficial romanisations. That doesn't make much sense: it's called a romanised field after all, so giving priority to translations is not very logical. They should be optional.
This point also conflicts with the sentence under the Romanisation section "If the metadata has an official translation or romanisation, it is allowed to be used as-is in the romanised fields instead of romanising it yourself." because that sounds like an allowance and not like a preference/priority.

General -> Allowances -> For remixes, covers, or performances:
- The original artist may be used in the artist field, as long as the title field is modified to show that the song is not the original version. This marker should be in parentheses and contain the remix/cover artist or the performer as well as a descriptor. For example, the track triangles composed by cYsmix covered by mocha4life can be formatted as cYsmix - triangles (mocha4life Cover).
The explanation regarding the formatting is ambiguous because first it says "This marker should be..." and then "can be formatted as...", if it's an allowance it should probably just be "can/may" and not "should", right?

Handling Symbols
This could just be simplified to "Symbols" to make it more consistent with the other headers.

Handling Symbols -> Rules:
- The following Unicode Symbol subsets should have leading and trailing spaces when they can be romanised:
Supplemental Arrows-A, Supplemental Arrows-B, Miscellaneous Symbols and Arrows
Dingbats
Miscellaneous Symbols

This does not apply if the artist purposefully uses symbols in ways that do not suggest spaces. Other character sets are handled on a case-by-case basis.
Why are these specific subsets of symbols listed here? Are there other categories where this rule doesn't apply?
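For reference, the subsets named in the quoted rule correspond to fixed Unicode block ranges, so a check could be sketched as below. This only flags candidates: whether a symbol "can be romanised" or is purposefully used without spaces is a judgment call the code does not attempt. `unspaced_symbols` is a hypothetical helper name.

```python
# Code point ranges of the Unicode blocks named in the quoted rule.
SPACED_BLOCKS = {
    "Miscellaneous Symbols": (0x2600, 0x26FF),
    "Dingbats": (0x2700, 0x27BF),
    "Supplemental Arrows-A": (0x27F0, 0x27FF),
    "Supplemental Arrows-B": (0x2900, 0x297F),
    "Miscellaneous Symbols and Arrows": (0x2B00, 0x2BFF),
}


def in_spaced_block(ch: str) -> bool:
    """True if the character belongs to one of the listed blocks."""
    return any(lo <= ord(ch) <= hi for lo, hi in SPACED_BLOCKS.values())


def unspaced_symbols(text: str) -> list:
    """Return (index, symbol) pairs for listed symbols missing a space on either side."""
    hits = []
    for i, ch in enumerate(text):
        if not in_spaced_block(ch):
            continue
        # The start and end of the string count as acceptable neighbours.
        left_ok = i == 0 or text[i - 1].isspace()
        right_ok = i == len(text) - 1 or text[i + 1].isspace()
        if not (left_ok and right_ok):
            hits.append((i, ch))
    return hits


print(unspaced_symbols("A★B"))
```

Symbols from other blocks, such as the plain Arrows block, are ignored here, which matches the draft leaving other character sets to be handled case by case.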

Artist -> Marker Rules
The term "marker" here isn't ideal because it's also used for Title -> Markers where it refers to different things. It would probably be clearer if a different term was used here or if the other markers are labelled as "Length/Version Markers" to distinguish them.

Artist -> Marker Rules -> Character (CV: Voice Actor) and Character (VO: Voice Actor):
- If other artist(s) use a similar marker, such as c.v., CV., ~cv~, etc., replace it with this format.
This could be written in an easier way like "Similar markers, such as c.v., CV., ~cv~, etc., must be replaced with this format."
Was the point about "For live action, credit the voice actor only." left out intentionally?

Artist -> Allowances:
- Artists may be simply listed with ,, & or + in between each artist.
Instead of "+" I'd use "x" here because it's much more common. Also, I assume this is not supposed to be an exhaustive list of symbols, therefore "etc." should be added to indicate that.

Title -> Rules -> When a track is made of two or more songs you must do either one of the following:
- List the titles clearly with a dividing symbol in-between such as ,, &, +, /, etc.
Again I'd replace "+" with a more common symbol/term, in this case "vs.", for example used in beatmapsets/325158 and beatmapsets/707380.

Markers
Is the reason why some of them are listed under Rules and some under Guidelines because the first group is always added and the second group only in case the song already has the marker in some form? I can't think of a concrete term but maybe there's a clearer way to indicate that these are 2 separate lists because right now they're both just labeled as "Marker Categories". I get that each of them is within the Rules and Guidelines section respectively, but still.

Markers -> Rules:
- Songs with length markers must fully replace them with the standard marker.
Perhaps change this to "Songs with existing length markers must fully replace them with the relevant marker from the list below." to make it clearer.

Markers -> Rules:
- Songs without a length marker that fit a rules marker category must add the corresponding one at the end.
This one also took me a few times of re-reading it to understand what it means, it could be changed to something like this to explain it better: "Songs without a length marker that fit one of the marker categories below must add the corresponding one at the end."

Markers -> Rules -> (Cut Ver.)
I think it would be a good idea to add "Songs that are a full loop of a looping track will not be considered cut." as clarification.

Markers -> Guidelines
The 2 sentences at the beginning should be listed with bullet points because they're guidelines themselves, not just an explanation for this section.
It would also be nice if the order of (Short Ver.) and (Game Ver.) was reversed so that they're in order from most common to least common.

Markers -> Guidelines -> Marker Categories -> (Game Ver.):
- Commonly found in video game songs. Use this marker when there is an existing length marker such as ~Game Size~, (Game Size), game OP edit, OP Version.
The part about "Commonly found in video game songs" could be removed entirely because it should be obvious and also doesn't really provide useful information for the guideline.
I'm also not sure about "OP Version" because this is different from the others in the sense that it doesn't include the word "Game" at all. Are there examples where this has been changed to (Game Ver.)?

Markers -> Guidelines -> Marker Categories -> (Short Ver.):
- Usually used in game openings to signal that a longer version actually exists.
Similarly, this part seems rather unnecessary and could be omitted.

Markers -> Guidelines -> Marker Categories -> (##### Ver.):
When song titles already have a length marker not covered above, it should be changed to a descriptive (#### Ver.) marker using title case.
What does "descriptive" refer to here?
There should be an additional "#" here to be consistent with the first mention of this marker.
I also think it would be important to add an example for this one to make it easier to understand the use cases.

Markers -> Guidelines -> Marker Categories -> (##### Ver.):
Exceptions would be for when the length marker is so stylised it is considered part of the title, such as Pippiquest (Pippi x Mocha Romantic Movie Remix Edition)
Is this strictly about stylisation or just having an extravagant version naming? Assuming it's the latter, I don't really see why it wouldn't still fall under the same guideline since you could very well just replace "Edition" with "Ver." without ruining the naming scheme. Stylisation would be more related to things like formatting (the entire point of standardized markers is to make formatting consistent) and capitalisation (which is already addressed in the allowance of this section).

Markers -> Allowances:
- Alternate casing for markers may be used if the rest of the song title is stylised to fit the formatting.
"Alternate" should probably be changed to "Alternative" to match the previous instance of this allowance.

Sources -> Rules:
The Source field must be precise. Use the most specific source instead of series or project names, unless multiple sources within a series apply.
The word "general" could be added before "series or project names" to emphasize that it's an umbrella term.

Sources -> Guidelines
Bullet points should be added before "If a track..." and "Website names are only valid sources, if the song...".

Tags -> Rules -> Tags must include the following items when applicable:
- Guest difficulty creators, storyboarders, skinners, and hitsounders.
I'd reword this to "Guest difficulty, hitsound, storyboard and skin creators." because "skinners" sounds strange and it's more consistent this way.
It could be useful to add this sentence from the current ranking criteria because it's something a lot of people don't know or get wrong often: "Usernames in tags containing single characters separated by spaces must have the spaces replaced with underscores."

Tags -> Rules -> Tags must include the following items when applicable -> At least one song genre and one language tag:
- If the lyrics in the song have no meaning, use other as the language tag.
I don't think putting "other" in the tags is useful, in this case it would be better to just use "gibberish" or something like that.

Tags -> Rules -> Tags must include the following items when applicable:
- Tags must be related to the beatmap. Describing the style, song, storyboard, video, or background content is fine.
The "is fine" part sounds a bit weird here, it could be simplified to "Tags must be related to the beatmap, such as describing the style, song, storyboard, video, or background content."
Maybe also add the part about tags not being misleading from the current RC?

Tags -> Guidelines -> Tags should include the following items when applicable:
- Easily searched versions of difficult-to-write parts of the metadata.
I suggest adding "... such as contractions with the apostrophe removed" since it's a common example.

Tags -> Guidelines -> Tags should include the following items when applicable
Another bullet point mentioning things like album/EP/single name could be added.

Romanisation -> Language and Writing-system Romanisation Rules -> Japanese:
- ā to aa, ū to uu, ē to ee
Shouldn't there also be "ī to ii" here (even if it's rare)?
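For illustration, the three quoted mappings could be applied like this. Only ā, ū, and ē are built in because those are the ones quoted from the draft; the ī to ii case is passed in separately since it is only a suggestion at this point. `expand_long_vowels` and its `extra` parameter are made up for this sketch.

```python
import unicodedata
from typing import Dict, Optional

# Only the three mappings quoted from the draft.
LONG_VOWELS = {"ā": "aa", "ū": "uu", "ē": "ee"}


def expand_long_vowels(text: str, extra: Optional[Dict[str, str]] = None) -> str:
    """Replace macron vowels with doubled vowels; `extra` adds further mappings."""
    mapping = {**LONG_VOWELS, **(extra or {})}
    # Normalise to NFC so "a" + combining macron is treated the same as "ā".
    composed = unicodedata.normalize("NFC", text)
    return composed.translate(str.maketrans(mapping))


print(expand_long_vowels("kāto"))
print(expand_long_vowels("shīna", extra={"ī": "ii"}))
```

Uppercase macron vowels are not covered by the quoted rule, so the sketch leaves them untouched.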

Romanisation -> Language and Writing-system Romanisation Rules -> Chinese:
For other dialects: Left to mapper's discretion, contacting a native speaker is recommended.
In my opinion this would sound better as a complete sentence like "For other dialects, it's left to the mapper's discretion. Contacting a native speaker is recommended."

Romanisation -> Language and Writing-system Romanisation Rules -> Other languages or systems not covered:
- Use a system common and recognisable.
I'd change the order around to "Use a common and recognisable system." to make it sound more natural.
-infinite-
Thanks for all the work! This is definitely a huge improvement. Some questions here. Many of them are not raised by this rewrite itself, but I hope they can be clarified at this chance (I guess many of them already have consensus in practice so they only require clarification):

Handling Symbols -> Rules
The following Unicode Symbol subsets should have leading and trailing spaces when they can be romanised:
What does the "can be romanised" mean? How to tell if a Unicode Symbol can be romanised? I think it meant "if it's not removed in the romanised field" (which, if I understand correctly, can be a mapper's preference in some cases)?


Source -> Rules
The Source field must be used, if the song...
directly originates from or is tied to a piece of media, except for albums and hosting websites.
I kinda like the original RC where some examples of "media" are listed (such as a video game, movie, series, event, etc.) since "media" itself is a very abstract concept

Tags -> Rules
Tags must include the following items when applicable:
At least one song genre and one language tag.
For instrumental tracks, instrumental is the language tag.
Could we have a definition/guideline on what is an "instrumental" track? e.g. which is correct, beatmapsets/1807003 , beatmapsets/1501157 , or either is fine? btw, I didn't see any mention of the genre and language setting on the website?

Romanisation -> Rules
Loan words must use the source language's spelling when romanised.
what qualifies for a "loan word"? e.g. should 上海/シャンハイ be Shanhai or Shanghai(上海 in Chinese),トンネル be tonneru or tunnel? (tbh idk how this could be imposed in practice since languages borrow words from each other like every year, and it can be hard to tell if a word is borrowed or already "localized")

When the song uses repeat words in the title or artist where one is unicode, and the other is a romanisation, the romanised field must use the romanisation only and remove the duplicate word.
what does this mean? Let's say a song is named "にゃん! Nyan!", should it be romanised to "Nyan!", or "Nyan! Nyan!", or "! Nyan!"(wtf)? idk but I guess this rule is intended for very specific cases and not used in this way?

Japanese
Capitalise following title case, ...
it would be nice if there are examples on common words in Japanese that should not be capitalized.