[Proposal] Metadata section overhaul

Topic Starter
Okoratu
Hi~

This is the proposal merged from t/595864 and t/632681 with the help of Noffy. It tries to incorporate both proposals while respecting the discussions going on in either thread, unifying them into a standard that is simple to understand and follow (ideally like the rest of the Ranking Criteria)


We also tried to include t/687064, but that thread has stopped moving without reaching any conclusion (quite to the contrary, the conclusion seems to be that the proposed approach just doesn't work at all)

Debate on this will be open for two weeks ending on 05.04.2018 (dd.mm.YY)

Thanks for reading, have fun discussing!
Shiguma
I believe that TV size cuts of songs should have the (TV Size) label on them, regardless of the official source. My reasoning for this is, when you search up a song, having the (TV Size) in the metadata won't affect searching for that song, while also making it very clear that whichever set you are looking at is the short version of a song. If we're bringing common sense into metadata, I don't see why we shouldn't do this.
Pachiru
"Guest mappers and storyboarders must be added to the tags of a beatmap set." Should people who make the hitsounds be added to the tags as well, or is that not mandatory?
Vacuous
If a track has more than 5 artists they must be substituted with Various Artists, similarly if a track is composed of 3 or more individual tracks, the title must be substituted with <Descriptor> Compilation.
What about songs with multiple parts? Would compiling these together warrant it being called "[song title] compilation"? Does that mean that the metadata for this is wrong?
Bunnrei
the last two glossary entries seem less like glossary entries and could be added in the rules/guidelines

The artists of a song must be traceable to existing people.
so something like the "Gorillaz" (a band with fictional members run by two non-fictional people) would be unacceptable, or does that only apply to individual people?

Any form of vs. such as Vs., VS and the likes are to be written as vs. only.
does this apply even if the original metadata source uses the latter three

...and the likes are substituded to and asterisk
typo boi

everything else seems ok other than pachiru's point (could just be "People who made contributions within the map outside of modding must be added to tags" or smthng)
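
For illustration only, the "vs." rule quoted above can be written as a mechanical check. This is a minimal Python sketch, not part of the proposal or of any osu! tooling: `normalise_vs` is a hypothetical helper, and it assumes that a bare "vs" without a period should also become "vs.".

```python
import re

# Matches "vs" as a whole word in any casing, plus an optional trailing period.
_VS_PATTERN = re.compile(r"\bvs\b\.?", re.IGNORECASE)


def normalise_vs(text: str) -> str:
    """Rewrite Vs., VS, vs and similar variants as "vs." (hypothetical helper)."""
    return _VS_PATTERN.sub("vs.", text)
```

Under these assumptions `normalise_vs("Artist A VS Artist B")` gives `"Artist A vs. Artist B"`, and a string that already uses "vs." is left unchanged.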
Topic Starter
Okoratu
@plus:
1 if you can trace the members of a band to real people then that is sufficient
2 always
3 oops

@Vacuous
as far as i understand what you said, this would be mislabeling the song, so yes

@Pachiru
open for debate on that one

@Shigu
fixed by putting asterisk in glossary and just making it default to star and rewording that :thonking:
Kurai
- The word 'Romanisation' should always be capitalised.

- Saying 'Songs with Russian metadata' is dumb as it excludes all other languages that use the Cyrillic script (there are some Ukrainian songs ranked for example).

- Cyrillic Romanisation should follow the BGN/PCGN system (except for the letter ё in Russian which should follow the GOST 2002(B) system). Read more here: http://up.kuraip.net/032209ex3724.pdf

- 'If the song title includes any denoting tag that it is a TV sized cut of a song, use a standard (TV Size) tag in its place in the end of the current title string.'
→ It was decided that TV Size tags should be removed from the titles and put in the tags.

I'll probably go more in depth later!
Topic Starter
Okoratu
→ It was decided that TV Size tags should be removed from the titles and put in the tags.

by? i mean im fine with either but just saying you decided this doesnt invalidate my suggestion of doing it? Especially when you can argue that the more obvious marker you put in there the easier it is for people to discern what version of a song they're getting without having to bother with reading the tags.

the rest is fair enough and was ported from kwan proposal for the most part so i'll adjust it tomorrow, this took way longer and i wanna do something different for today
Sieg
and here we go again with Cyrillic Romanisation
p/6037577
Kurai
@Oko: That was the consensus we reached in the Metadata server after a long discussion on the matter. Please check #guidelines_discussion and search for 'TV Size', you should be able to find the conversation. Invalidating something that has already been put into practice is a bit weird.
J1NX1337
Brackets within artist or title fields should be separated from the other text surrounding it, unless there is obvious reason to not do so.
-> unless there is an obvious reason not to do so.

Slight grammar correction.
Topic Starter
Okoratu
this is a public discussion.
discuss in public. i won't go through every channel i can find to reconstruct what may or may not have happened / occurred / been agreed on in order to make sense of your argument.

present the argument, debate again, the point is up.

will apply whatever the russian metadata thing ends up as. that said, i find Shiguma's point reasonable, actually
Pachiru
For tags, from draft: "This is to give credit where credit is due and helping others identify the main contributers of any given beatmap set."

Since we should give feedback to the main contributors of the set, we need to include the person who made the hitsounds, which are an important part of the map, because without hitsounds the map is unrankable. If we give credit to someone who made something optional like a storyboard, we should add the hitsounder to the tags as well.

(also, don't forget to fix the typo on: "contributers" → "contributors")
Sieg
alright
Cyrillic Romanisation: Use BGN/PCGN system for Russian/Cyrillic. Е and е should be romanised as ye if it stands alone or after a, e, ё, и, о, у, ы, э, ю, я, й, ъ, ь. In other cases, it should be romanised as e. ё should be romanised to ye, however, use yo or o to avoid usage of special characters. Ignore any other rules in the file provided, these are either irrelevant or wouldn't help in the game. If an artist uses a preferred romanisation, follow it regardless of this rule. For most of the other characters, refer to the first page of this document.
I suggest removing "Cyrillic" and leaving other languages to a case-by-case basis, since the current wording is ambiguous: it tries to cover all Cyrillic-script languages, which have different phonetics and different romanisation scenarios, and that is obviously the wrong approach. So, reword it to

Russian Romanisation: Use BGN/PCGN system for Russian. Е and е should be romanised as ye if it stands alone or after a, e, ё, и, о, у, ы, э, ю, я, й, ъ, ь. In other cases, it should be romanised as e. ё should be romanised to ye, however, use yo or o to avoid usage of special characters. Ignore any other rules in the file provided, these are either irrelevant or wouldn't help in the game. If an artist uses a preferred romanisation, follow it regardless of this rule. For most of the other characters, refer to the first page of this document.
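
To make the Е/е part of the wording above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not an implementation of the full BGN/PCGN table: `romanise_e` is a hypothetical helper, "stands alone" is assumed to mean "not preceded by another letter", and every character other than Е/е (including ё) is passed through untouched.

```python
E_UPPER = "\u0415"  # Cyrillic capital Е
E_LOWER = "\u0435"  # Cyrillic small е

# Letters after which Е/е becomes "ye": а е ё и о у ы э ю я й ъ ь
YE_TRIGGERS = set(
    "\u0430\u0435\u0451\u0438\u043e\u0443\u044b\u044d\u044e\u044f\u0439\u044a\u044c"
)


def romanise_e(text: str) -> str:
    """Romanise only Е/е in `text`; all other characters are left as they are."""
    out = []
    for i, ch in enumerate(text):
        if ch not in (E_UPPER, E_LOWER):
            out.append(ch)
            continue
        prev = text[i - 1].lower() if i > 0 else ""
        # Assumption: "stands alone" = no letter directly before it.
        use_ye = (not prev.isalpha()) or prev in YE_TRIGGERS
        rom = "ye" if use_ye else "e"
        out.append(rom.capitalize() if ch == E_UPPER else rom)
    return "".join(out)
```

The output is deliberately mixed-script, since only one letter is handled; a real romaniser would map the remaining characters from the table in the linked document, and an artist's preferred romanisation would still override all of this.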
Delis

Shiguma wrote:

I believe that TV size cuts of songs should have the (TV Size) label on them, regardless of the official source. My reasoning for this is, when you search up a song, having the (TV Size) in the metadata won't affect searching for that song, while also making it very clear that whichever set you are looking at is the short version of a song. If we're bringing common sense into metadata, I don't see why we shouldn't do this.
this. i've never had a ranked tv size map myself, so i don't know how it feels as a mapper, but having tv size and full versions mixed under one title is quite frustrating as a player. i don't remember exactly when or why it happened, but we'd better bring the label back, since a lot of people back then didn't agree with finalising the current tv size rule (under which we have to rely on whether official sources have a track with tv size in its name).
Nevo

Pachiru wrote:

"Guest mappers and storyboarders must be added to the tags of a beatmap set." Shouldn't people who make the hitsounds be added to the tags as well, or is that not mandatory?
I definitely feel people who make hitsounds for a set should be required in tags, since they do play a large role in mapsets, also because a lot of hitsounders put forth plenty of time and effort into their work.
Nao Tomori
agre w/ nevo abt hitsounders in tags.

i think forcing tv size is a bit stupid since "cut ver." for example isn't forced if it's not official yet it serves the exact same purpose. it takes players about 3 seconds to see if a map is tv size or not, and in many cases dropping it makes the title look much cleaner anyway.
Pachiru

Nevo wrote:

Pachiru wrote:

"Guest mappers and storyboarders must be added to the tags of a beatmap set." Shouldn't people who make the hitsounds be added to the tags as well, or is that not mandatory?
I definitely feel people who make hitsounds for a set should be required in tags, since they do play a large role in mapsets, also because a lot of hitsounders put forth plenty of time and effort into their work.
Yes, I misspelled one word. I gave my opinion on this further down, and it matches yours :)
jeanbernard8865

Nevo wrote:

Pachiru wrote:

"Guest mappers and storyboarders must be added to the tags of a beatmap set." Shouldn't people who make the hitsounds be added to the tags as well, or is that not mandatory?
I definitely feel people who make hitsounds for a set should be required in tags, since they do play a large role in mapsets, also because a lot of hitsounders put forth plenty of time and effort into their work.
i think the rule should be extended to anyone whose contribution is visible in the mapset ( for example, someone who provided the mp3 for a map or a modder will not be in tags, but someone who keysounded a section in the top diff will )
ailv
If a track has more than 5 artists they must be substituted with Various Artists, similarly if a track is composed of 3 or more individual tracks, the title must be substituted with <Descriptor> Compilation.

What about albums? E.g If I were to map https://soundcloud.com/owslaofficial/se ... nctuary-ep the entirety of this ep mixed into one track, I can't see why I'd have to name it "Sanctuary EP Compilation" over just "Sanctuary EP".


Songs with German umlauts (ü, ö, ä and ß) must be romanised into two-letter combination (ue, oe, ae and ss).
This doesn't make sense in certain cases, for example https://osu.ppy.sh/s/723626 wouldn't make sense as the "Ü" is simply a stylizing choice, it's intended to be a U with some fancy shmancy stuff.

Commas, vs., &, feat./ft., CV: must always use a trailing whitespace. Unless it is a comma, leading whitespace is also required.
does this include "ft" or "feat" without a "."

If a song or artist are referred to in multiple ways on official sources provided by the artist, the mapper is free to choose any of the romanisations. The only exception to this is if the song already has a mapset in the Ranked Section, in which case the corresponding guideline applies to it.
How does this apply to https://osu.ppy.sh/p/beatmaplist?q=diao%20ye%20zong which has had maps ranked under "Diao Ye Zong" and "RD-Sounds".

Only use the Source field if the song comes from, is remixed from or specifically fan-made for a video game, movie, or series. Website names are not an acceptable Artist nor Source.
Does this include if the song was popularized by a specific game/movie/series? I believe that's how it's being handled rn as well.

Guest mappers and storyboarders must be added to the tags of a beatmap set. This is to give credit where credit is due and helping others identify the main contributers of any given beatmap set.
Does this need to be extended to previous usernames?

For songs belonging to doujin circles, the circle name must be used over the vocalist or composer, unless these contributors are not part of the circle. In these cases the priority falls on vocalist followed by composer for instrumental songs.
I'm not sure if I'm interpreting this wrong, but shouldn't it be more something like guest composer + circle, to give accurate credit in those cases?


Special unicode characters must be filtered to their nearest standard equivalent or removed from the Romanised Artist and Romanised Title fields within a .osu file. ★ ☆ ⚝ ✩ ✪ ✫ ✬ ✭ 🟉 🟊 ✮ ✯ ✰ and the likes are substituted with an asterisk. Corner Brackets have to be written as quotation marks instead. Other special characters are to be romanised or dropped on case-by-case basis.
I believe there should be some clarification between ' ' and " ", being used for romanization of 「 」, see https://osu.ppy.sh/s/735097, https://osu.ppy.sh/s/538136. Since British and American ways differ.
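
As a rough illustration of the special-character rule quoted above, the substitution could look like the Python sketch below. `filter_special` and its `quote` parameter are hypothetical and exist only for this example; the parameter is there because the thread has not settled whether ' ' or " " should replace 「 」, and characters the draft leaves to case-by-case handling are not touched.

```python
# Star-like characters listed in the draft; spaces are only separators here.
STAR_LIKE = "★ ☆ ⚝ ✩ ✪ ✫ ✬ ✭ 🟉 🟊 ✮ ✯ ✰"

LEFT_CORNER = "\u300c"   # 「
RIGHT_CORNER = "\u300d"  # 」


def filter_special(text: str, quote: str = '"') -> str:
    """Replace listed stars with "*" and corner brackets with `quote`."""
    table = {ord(ch): "*" for ch in STAR_LIKE if not ch.isspace()}
    table[ord(LEFT_CORNER)] = quote
    table[ord(RIGHT_CORNER)] = quote
    return text.translate(table)
```

Calling it with `quote="'"` gives the single-quote variant, so either convention can be tried against the two linked mapsets without changing the rest of the logic.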
Sieg

ailv wrote:

Only use the Source field if the song comes from, is remixed from or specifically fan-made for a video game, movie, or series. Website names are not an acceptable Artist nor Source.
Does this include if the song was popularized by a specific game/movie/series? I believe that's how it's being handled rn as well.
Also, since this is enforced rn, I think there should be some sort of indication that source is a must even for remixes, covers, whatever if original comes from vg, movie, series etc. Because atm it sounds like it's your choice to put it or not. Maybe someone can help with proper wording?
ailv
Oh adding on actually, I think there should be some clarification of what sources are acceptable too, https://osu.ppy.sh/s/729305 would allow both "東方Project" and "東方輝針城 ~ Double Dealing Character."

Something like, if your source is part of a large series, you may use either the specific game, or the series.
Topic Starter
Okoratu

Naotoshi wrote:

agre w/ nevo abt hitsounders in tags.

i think forcing tv size is a bit stupid since "cut ver." for example isn't forced if it's not official yet it serves the exact same purpose. it takes players about 3 seconds to see if a map is tv size or not, and in many cases dropping it makes the title look much cleaner anyway.
that's true, idk what to do about it for now because delis speaking for players has a point imo

Sieg wrote:

alright
Cyrillic Romanisation: Use BGN/PCGN system for Russian/Cyrillic. Е and е should be romanised as ye if it stands alone or after a, e, ё, и, о, у, ы, э, ю, я, й, ъ, ь. In other cases, it should be romanised as e. ё should be romanised to ye, however, use yo or o to avoid usage of special characters. Ignore any other rules in the file provided, these are either irrelevant or wouldn't help in the game. If an artist uses a preferred romanisation, follow it regardless of this rule. For most of the other characters, refer to the first page of this document.
I suggest removing "Cyrillic" and leaving other languages to a case-by-case basis, since the current wording is ambiguous: it tries to cover all Cyrillic-script languages, which have different phonetics and different romanisation scenarios, and that is obviously the wrong approach. So, reword it to

Russian Romanisation: Use BGN/PCGN system for Russian. Е and е should be romanised as ye if it stands alone or after a, e, ё, и, о, у, ы, э, ю, я, й, ъ, ь. In other cases, it should be romanised as e. ё should be romanised to ye, however, use yo or o to avoid usage of special characters. Ignore any other rules in the file provided, these are either irrelevant or wouldn't help in the game. If an artist uses a preferred romanisation, follow it regardless of this rule. For most of the other characters, refer to the first page of this document.

J1NX1337 wrote:

Brackets within artist or title fields should be separated from the other text surrounding it, unless there is obvious reason to not do so.
-> unless there is an obvious reason not to do so.
Slight grammar correction.
ok

Kurai wrote:

- Cyrillic Romanisation should follow the BGN/PCGN system (except for the letter ё in Russian which should follow the GOST 2002(B) system). Read more here: http://up.kuraip.net/032209ex3724.pdf
can some russians say something about http://up.kuraip.net/032209ex3724.pdf ? it seems to make sense and encompass all things said

Pachiru wrote:

For tags, from draft: "This is to give credit where credit is due and helping others identify the main contributers of any given beatmap set."

Since we should give feedback to the main contributors of the set, we need to include the person who made the hitsounds, which are an important part of the map, because without hitsounds the map is unrankable. If we give credit to someone who made something optional like a storyboard, we should add the hitsounder to the tags as well.

(also, don't forget to fix the typo on: "contributers" → "contributors")
sure. fixing this

ailv wrote:

If a track has more than 5 artists they must be substituted with Various Artists, similarly if a track is composed of 3 or more individual tracks, the title must be substituted with <Descriptor> Compilation.
What about albums? E.g If I were to map https://soundcloud.com/owslaofficial/se ... nctuary-ep the entirety of this ep mixed into one track, I can't see why I'd have to name it "Sanctuary EP Compilation" over just "Sanctuary EP".
suggest an alternative, this is a fair point


Songs with German umlauts (ü, ö, ä and ß) must be romanised into two-letter combination (ue, oe, ae and ss).
This doesn't make sense in certain cases, for example https://osu.ppy.sh/s/723626 wouldn't make sense as the "Ü" is simply a stylizing choice, it's intended to be a U with some fancy shmancy stuff.
limiting it to romanisation of german then because german words are affected

Commas, vs., &, feat./ft., CV: must always use a trailing whitespace. Unless it is a comma, leading whitespace is also required.
does this include "ft" or "feat" without a "."
now it does

If a song or artist are referred to in multiple ways on official sources provided by the artist, the mapper is free to choose any of the romanisations. The only exception to this is if the song already has a mapset in the Ranked Section, in which case the corresponding guideline applies to it.
How does this apply to https://osu.ppy.sh/p/beatmaplist?q=diao%20ye%20zong which has had maps ranked under "Diao Ye Zong" and "RD-Sounds".
if the song youre mapping has 2 ranked sets with both then you can go back to choosing, otherwise follow ranked unless that one was wrong

Only use the Source field if the song comes from, is remixed from or specifically fan-made for a video game, movie, or series. Website names are not an acceptable Artist nor Source.
Does this include if the song was popularized by a specific game/movie/series? I believe that's how it's being handled rn as well.
example needed?

Guest mappers and storyboarders must be added to the tags of a beatmap set. This is to give credit where credit is due and helping others identify the main contributers of any given beatmap set.
Does this need to be extended to previous usernames?
does it say so? no.

For songs belonging to doujin circles, the circle name must be used over the vocalist or composer, unless these contributors are not part of the circle. In these cases the priority falls on vocalist followed by composer for instrumental songs.
I'm not sure if I'm interpreting this wrong, but shouldn't it be more something like guest composer + circle, to give accurate credit in those cases?
usually when that happens the song is a guest on an album anyways, but i'm not sure myself. more debate required.


Special unicode characters must be filtered to their nearest standard equivalent or removed from the Romanised Artist and Romanised Title fields within a .osu file. ★ ☆ ⚝ ✩ ✪ ✫ ✬ ✭ 🟉 🟊 ✮ ✯ ✰ and the likes are substituted with an asterisk. Corner Brackets have to be written as quotation marks instead. Other special characters are to be romanised or dropped on case-by-case basis.
I believe there should be some clarification between ' ' and " ", being used for romanization of 「 」, see https://osu.ppy.sh/s/735097, https://osu.ppy.sh/s/538136. Since British and American ways differ.
then clarify instead of saying that?
CrystilonZ
The same applies to the Source field if a romanised Source is preferred by the mapper.
I don't believe romanised source is appropriate tbh. Stuff inside the source field should be in its original language since that field does not limit usable characters to only stuff found on normal english keyboards. If the mapper wants the english title inside that field, he/she can do so only if there is an official english title.

https://osu.ppy.sh/forum/p/6549402 << this isn't dead orz. Everyone just got incredibly busy lol. For the current state of that proposal there are some counterarguments for it but most are just pure fallacy or stuff that we've replied to already. I can summarise the thread and post it here if you want to. I'd love to see Chinese romanisation changed.
Topic Starter
Okoratu
updated draft, btw

2 months of no one doing anything is pretty dead
also the source is a unicode field as such it can hold anything we want it to -> the mapper should have the choice to decide which one is shown in the client
ailv

ailv wrote:

If a track has more than 5 artists they must be substituted with Various Artists, similarly if a track is composed of 3 or more individual tracks, the title must be substituted with <Descriptor> Compilation.
What about albums? E.g If I were to map https://soundcloud.com/owslaofficial/se ... nctuary-ep the entirety of this ep mixed into one track, I can't see why I'd have to name it "Sanctuary EP Compilation" over just "Sanctuary EP".
suggest an alternative, this is a fair point

Compilation should be used in cases where the songs are not already part of an organized set of songs.


Songs with German umlauts (ü, ö, ä and ß) must be romanised into two-letter combination (ue, oe, ae and ss).
This doesn't make sense in certain cases, for example https://osu.ppy.sh/s/723626 wouldn't make sense as the "Ü" is simply a stylizing choice, it's intended to be a U with some fancy shmancy stuff.
limiting it to romanisation of german then because german words are affected

Can we state that this applies specifically to songs with German title or artist fields then? Currently the wording implies that any song with those German umlaut characters would need to be romanized as such.

Guest mappers and storyboarders must be added to the tags of a beatmap set. This is to give credit where credit is due and helping others identify the main contributers of any given beatmap set.
Does this need to be extended to previous usernames?
does it say so? no.

I've seen discussion about this before, I think I've seen cases where the previous username was added to the online tags so a DQ wouldn't be needed. I think further discussion would be useful.

Special unicode characters must be filtered to their nearest standard equivalent or removed from the Romanised Artist and Romanised Title fields within a .osu file. ★ ☆ ⚝ ✩ ✪ ✫ ✬ ✭ 🟉 🟊 ✮ ✯ ✰ and the likes are substituted with an asterisk. Corner Brackets have to be written as quotation marks instead. Other special characters are to be romanised or dropped on case-by-case basis.
I believe there should be some clarification between ' ' and " ", being used for romanization of 「 」, see https://osu.ppy.sh/s/735097, https://osu.ppy.sh/s/538136. Since British and American ways differ.
then clarify instead of saying that?

I don't know which one is better, I'd propose that both are acceptable and up to choice.
Topic Starter
Okoratu
1 tried to clarify compilations

2 that is what "german metadata" encompasses

3 probably why these maps weren't dq'd

4 yea idk either
Noffy
ok time for a re-review with slightly fresher eyes

the thing part nine wrote:

Word-by-word Romanisation: Each character must be romanised into a single, capitalised, separated word. Refer to this thread for examples and supplementary information.
this isn't the same thread anymore and doesn't include that supplementary information section so having this doesn't make sense oops


the thing part thirty wrote:

Guest mappers, storyboarders, and hitsounders must be added to the tags of a beatmap set. This is to give credit where credit is due and helping others identify the main contributors of any given beatmap set.
-> + "Skinners should be added if they made the skin specifically for the mapset" (in contrast to someone just borrowing/mixing skin elements that're already out there) (this would be nice)


the thing part forty two wrote:

Commas, vs., &, any variations of feat./ft., CV: must always use a trailing whitespace. Unless it is a comma, leading whitespace is also required.
(CV: blah) vs. ( CV: blah) . the latter would look silly, so CV: shouldn't require leading whitespace either. Or uhhh... this doesn't apply to sides which have the inside of a bracket next to them? or something. since it'd also apply to like, (feat.) vs. ( feat. ) which isn't.. better really.. hmmm
I'm not sure how to fix the wording for this though
aaaaaaaa

Okoratu wrote:

4 yea idk either

Corner Brackets have to be written as quotation marks instead.
->
Corner Brackets have to be written as quotation marks instead, with either the British or American methods unless there is an official source specifying a specific method.


?
may be a bit long for what it's a part of though.


thanks oko~
ailv
If the creator of the mapset has done major edits to the .mp3, they are free to name it appropriately to signal that this song is a special version. In this case the original songs must still be clearly indicated in order for players to be able to search for the original songs.
How exactly would the line for "major edits" be drawn? I think this specific part requires additional discussion. I personally would suggest that a major edit means the .mp3 has been edited to add or remove instruments? I'm not too sure here.

Corner Brackets have to be written as quotation marks instead.
Add an example of what a corner bracket is "⸤ ⸥" and " 「 」 " since there are multiple forms.

Other special characters are to be romanised or dropped on case-by-case basis.
Will these special characters be added to the rc? If not, I would suggest updating the rc as such characters come up on a case-by-case basis.

Brackets within artist or title fields should be separated from the other text surrounding it, unless there is obvious reason not to do so. Reasoning like this would include syntactical use of brackets and the general typesetting of a song title or artist using them without whitespaces often and consistently across multiple platforms.
Can we clarify what the word "separated" refers to in this context? I think it makes more sense to explicitly state that separation should be done using whitespaces, unless there is an obvious reason to not do so. Otherwise title,[stuff] is technically separated by a comma.
Noffy

ailv wrote:

Corner Brackets have to be written as quotation marks instead.
Add an example of what a corner bracket is "⸤ ⸥" and " 「 」 " since there are multiple forms.
That's why it's in the glossary
ailv

Noffy wrote:

That's why it's in the glossary
oh shit im blind, add "⸤ ⸥" still though.
CrystilonZ

Okoratu wrote:

also the source is a unicode field as such it can hold anything we want it to -> the mapper should have the choice to decide which one is shown in the client
The point is that I don't consider the romanised source as being official. Why don't we just stick to the original language since it's clearly a better alternative? There's no need to romanise it to begin with
Topic Starter
Okoratu
oh there is: if you are english speaking and map japanese songs, the source shown in the top left ingame can't tell english-speaking people what the song is from, because an untranslated source just looks cryptic to them
CrystilonZ
Properly crediting the source should take priority here imo. Like there are measures to increase metadata standards so that it credits stuff properly. Replacing titles (even if they are in foreign scripts) with unofficial ones is not the best way to credit the source properly.