[Proposal] Metadata section overhaul

Shima Rin

abraker wrote:

Any thoughts about mapping style or patterns the maps have being in tags?
Don't you see that you are in the wrong topic?
Fycho
The main arguments are listed below:
  1. Whether we romanise Chinese titles character by character (each character romanised as a single, capitalised, separate word), or separate and capitalise whole words according to The Basic Rules of the Chinese Phonetic Alphabet Orthography.
  2. Whether to use "yu" or "u" to romanise the vowel "ü".
  3. Whether we need to distinguish dialects from Mandarin in romanisation.
For the first point, I recommend everybody read ISO 7098:2015 before sharing opinions. The romanisation of Chinese is much more complex than that of other languages and requires a lot of professional knowledge about Chinese. The new proposal can't handle a word or phrase with two or more meanings. For example, "他谁都打不过" is used intentionally to carry two meanings, "Nobody can beat him" and "He can beat everybody", which would be split as "Ta / Shui / Dou Da Bu Guo" and "Ta / Shui Dou Da Bu Guo". It also wouldn't be easier to remember or read for either Chinese or non-Chinese speakers. I won't go into detail, as someone else would like to give a more professional explanation.

For the second point, "v" is currently used widely. "ü" is a single-letter vowel, and it is pronounced differently from two-letter vowels like "iu", "an", "ie", "üe", "ai", "ao", etc. We use "YU" for "ü" only in passports and other specific cases, because passports require names in capital letters and "ü" doesn't have a capital form. Elsewhere, "v" is still used. For a single-letter vowel, "v" is the most common and familiar letter, it's officially supported, and it's what the majority of keyboard input methods use. I believe using "yu" for "ü" only makes it easier to read than "v" for non-Chinese speakers, but it's technically wrong, and it has no other benefit. The "u" in the syllable "yu" is actually and technically the vowel "ü". After "j / q / x / y" we write "u" for "ü", but that doesn't mean "u" can completely stand for "ü", nor that "yu" can stand for "ü". "y" isn't a vowel in the Pinyin system at all; it is a consonant with the same pronunciation as the vowel "i", while "iu" and "yu" are two completely different things. In the Pinyin system, vowels can't be made from consonants. That means "yu" could by no means become a two-letter vowel, and using "yu" for romanisation would totally disobey the language system. "v" works best at the moment.
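
To make the options concrete, here is a minimal Python sketch of the three spellings discussed for "ü" ("v", "yu", "u"). The function name and the scheme labels are made up for illustration and are not part of any osu! tool; it only handles toneless, lowercase syllables.

```python
def replace_u_umlaut(syllable: str, scheme: str) -> str:
    """Rewrite 'ü' in a toneless pinyin syllable using the chosen scheme.

    scheme is a hypothetical label: "v", "yu" or "u".
    """
    replacements = {"v": "v", "yu": "yu", "u": "u"}
    if scheme not in replacements:
        raise ValueError(f"unknown scheme: {scheme}")
    return syllable.replace("ü", replacements[scheme])


# "ju" already spells the vowel as "u", so it is left unchanged by every scheme.
for s in ("nü", "lüe", "ju"):
    print(s, [replace_u_umlaut(s, k) for k in ("v", "yu", "u")])
```

Note that the "u" scheme turns "nü" into "nu", which is already a different syllable in pinyin, while "v" and "yu" keep the two apart.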

For the third point: is it necessary to distinguish dialects from Mandarin in romanisation? As all of us know, dialects differ in pronunciation, and some have different grammar. However, none of the dialects has an official written form, and all of them are related to Mandarin. Many words written in Chinese characters are shared by all the dialects. Take "好心分手": you can't know whether it's Mandarin, Wu Chinese or Cantonese unless someone pronounces it. Officially and technically we can't tell which it is; it's just modern standard Chinese, and we romanise it in the standard way. Personally, I am a dialect speaker, and I speak both Wu Chinese and Mandarin well. The major issue is that there isn't any official way to write the dialects. It's not like Japanese dialects: Japanese kana (Hiragana, Katakana) are phonograms, the same as Latin script, whereas Chinese characters are ideographic. This means Chinese characters can't be used to represent the pronunciation of dialects, so there won't be any official written form for dialects, and there won't be any song title written in a dialect. There isn't any officially published way to romanise the pronunciation of dialects either. Therefore it's unnecessary to distinguish dialects from Mandarin. By the way, if you are going to bring up Cantonese (Yue Chinese), there isn't any official written form for Cantonese either. In HK and Macau, schools teach the standard Chinese written form; people personally like to type Yue characters in Cantonese, which is more of a cultural habit and isn't officially taught in school. Enforcing something unofficial just leaves us with endless discussions. That's why there is no official romanisation so far: we have already argued a lot about it in the real world and haven't reached a conclusion.
How can we romanise an independent language that doesn't even have a written form? I believe this is beyond the scope of the osu! community, and it's unnecessary to figure it out at the moment.

The current way of romanisation is relatively fair: it covers all the cases and gives a good result. It's not the best, but it's the most appropriate.
I've asked some Chinese-speaking QAT/GMT members (Nardoxyribonucleic, spboxer3 and Zero__wind) for their opinions on the proposal, and all of them think it's not necessary to revise the current romanisation rules for Chinese.
Hollow Wings
(maybe i'm not attentive enough... )

------------------------

CrystilonZ wrote:

For point 1. I just don't see how this is related to our discussion. " In automatic romanizing working progress"
1. if the osu community doesn't use an automatic or semi-automatic process, it'll be manual. and i just told you everything about why that process is complex.
2. other alphabetic languages can be romanized automatically. so that's what the osu community is doing.

are you gonna let that complex work be done manually by Chinese osu staff? (you are not Chinese, so you won't be the one doing it anyway.)
i prefer to just get rid of that and keep what we have: automatic romanisation, character by character.
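
A rough Python sketch of the difference between the two processes, using "好心分手", a title mentioned in this thread. The tiny lookup table and both function names are hypothetical; the word boundaries have to be supplied by a human, which is the manual part.

```python
from typing import List

# Toy character-to-pinyin table (toneless); a real table would be far larger.
PINYIN = {"好": "hao", "心": "xin", "分": "fen", "手": "shou"}


def romanise_per_character(title: str) -> str:
    """Fully automatic: one capitalised syllable per character."""
    return " ".join(PINYIN[ch].capitalize() for ch in title)


def romanise_per_word(words: List[str]) -> str:
    """Semi-automatic: the caller must already have split the title into words."""
    return " ".join(
        "".join(PINYIN[ch] for ch in word).capitalize() for word in words
    )


print(romanise_per_character("好心分手"))   # Hao Xin Fen Shou
print(romanise_per_word(["好心", "分手"]))   # Haoxin Fenshou
```

The first function needs nothing but the string; the second only works once somebody has decided where the words begin and end.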

------------------------

CrystilonZ wrote:

2. Can you quote the exact words from the document? also all the reasons as stated in the standard as well. I couldn't read it while working on the proposal because 115 swiss franc is hella expensive.
omg... you don't even have any channel or way to read that document? then you may not know lots of the concepts it mentions.
and if you haven't read it, then you didn't even meet the precondition of my previous post; that's bad news to me.
i still recommend you try hard to find a way to read that document.

since you are so strict about that, i'll paste some parts of that document. (but since it's copyrighted, i'll just paste the text here, not the original pictures.)

ISO7098:2015 said:
12 Automatic transcription for named entities

In the computer-assisted documentation, there are two approaches to automatic transcription for named entities, namely:

- fully automatic syllable transcription;

- rule-based and semi-automatic word transcription.
since you didn't read that document, i just wanna say that:
the main part of the document is just discussion about how to transcribe proper nouns (or just "names") of places and persons.
that's what i summed up about ISO7098:2015 in my previous post.

and i emphasize this again: at the international level, most Chinese words are still transcribed character by character into the Latin letters of pinyin.
ISO7098:2015 just made a small step: combining proper names.
the romanisation of Chinese in ISO is far from complete.

i don't think the osu community can do what ISO wasn't able to do.

------------------------

CrystilonZ wrote:

Read more about ideograms here. These are logograms. Modern Chinese characters are logographic.

A number of lines after this are about pinyin being a method of transcription. No comments there this is acknowledged since the beginning that this is just the way to pronounce stuff. And the next few lines are about Mandarin having a lot of homophones.
seems like you are really obsessed with concepts; maybe it's my bad for simplifying those things.

then let me explain clearly: being an "ideogram", as i called the Chinese character, is one of its properties, like with other ancient scripts.

the so-called "logogram" is not the exact definition of the Modern Chinese character. let's see what ISO7098:2015 shows:

ISO7098:2015 said:
2.6
ideophonographical character

graphic character (2.6) that represents an object or a concept and is associated with a sound element in a natural language.

EXAMPLE Chinese hanzi 鹤(crane), Japanese kanji 戦(war) and Korean hanja 册(book) are ideophonographical characters.
just to mention: hanzi (Chinese), kanji (Japanese), and hanja (Korean) read similarly, right? they all came from a common source: Chinese characters (汉字). and that is exactly the "Chinese character" i pointed out as an ideogram in my previous post.

and some additional knowledge here: you may know that, at the VERY FIRST, alphabetic characters were ideographic characters as well. later, people just got rid of their meanings and used those characters as a tool to build words, which didn't happen to Chinese.
(like when you see the character "m", you may see nothing or you may see everything; that's not how Chinese characters work.)

now we are clear enough to compromise on concepts: the Modern Chinese character is a kind of ideophonographic character.

(and also you may know that both ancient Chinese characters and alphabetic characters are ideographic characters.)

------------------------

CrystilonZ wrote:

Hollow Wings wrote:

however, this is not reversible.
This is not exactly true. If it were Mandarin would have been dead a long while ago because the only way to communicate would be carrying a crap ton of paper with you at all time and write stuff when you want to communicate.
In English context it would be equivalent to you guys seeing or hearing /tīm/ (IPA stuff. This reads time). Intuitively the first thing that come into your heads would be the time. Tick-tock clocky stuff. However under different contexts:
"Can you buy me some /tīm/. I'm going to use it to cook dinner." In this case /tīm/ is the herb thyme.
"I don't have enough /tīm/ to do my homework. It's due tomorrow." In this case it's "time"
"Two /tīm/ two equal four." In this context it means multiply. 10/10 grammar.
As you can see they are reversible with context. And when you guys speak to each other you're actively tracing back to the original Hanzi characters using their pronunciation. Therefore, saying that it is not reversible is not true. It's harder in Mandarin (410 syllables - crap tons of words. Do the maths) but the fact that there are people speaking Mandarin proves the fact that it's possible.
lol

NONSENSE.

i think you still don't have enough understanding of what Chinese words and sentences can become.

again, you CAN'T simply know what those Chinese characters exactly are unless you fully understand all of the components both in and around them.
if you just get the sentence without any other context, you will never be able to do that, which means that the sentence has various meanings.

here are some examples:
a. one of the best examples here, which shows that if you make a mistake with it, you may get into big trouble.
"Gu Niang, Shui Jiao Yi Wan Duo Shao Qian?"
this sentence mainly has two meanings:
1. "Hey girl, how much does a bowl of your dumplings cost?" (姑娘,水饺一碗多少钱?)
2. "Hey girl, how much does it cost to sleep with you for one night?" (姑娘,睡觉一晚多少钱?)
this widely happens in electronic alphabet systems without tones, just like the osu system.

b. a more common one here.
"Jie Dao Shou Zhang Zai He Shang"
this is too complex, so i'll just do some of the transcription, and you can do your mathematical mapping and see if you can figure out everything that sentence may mean:
1. "Jie Dao" → 接到(catch/catch up/get/take/etc.), 街道(street/road/way/etc.), etc.
2. "Dao Shou" → 到手(already get sth./reach your hands/etc.), 倒手(transfer things between hands/buy in&out/left hand/etc.), etc.
3. "Shou Zhang" → 手掌(palm/people you trust/etc.), 首长(boss/highest level person/etc.), 收账(charge/blackmail/etc.), etc
...
oh hell, i won't continue.

this also happens even if you have the words separated:
"Jiedao Shouzhang Zai Heshang"
↑ try your best to figure out what this means; i predict that you will find at least 4 meanings.

that's why i always say it's complex:
alphabetic characters can be transliterated immediately, even if you don't know what the word means.
and this won't work for ideophonographic characters, especially Chinese characters.

and that's also the detailed part of why it's not reversible.
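
The same point as a small Python sketch: going from characters to toneless pinyin gives one result, but going back gives several candidates. The word list only contains examples from this post, and the variable names are just for illustration.

```python
from collections import defaultdict

# Word -> toneless pinyin pairs taken from the examples in this post.
WORDS = {
    "水饺": "shui jiao",
    "睡觉": "shui jiao",
    "接到": "jie dao",
    "街道": "jie dao",
    "手掌": "shou zhang",
    "首长": "shou zhang",
    "收账": "shou zhang",
}

# Invert the mapping: each romanised string collects every word that produces it.
candidates = defaultdict(list)
for hanzi, pinyin in WORDS.items():
    candidates[pinyin].append(hanzi)

for pinyin, options in candidates.items():
    print(pinyin, "->", options)
```

Every romanised key here ends up with more than one candidate word, so the romanised text alone does not tell you which characters were meant.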

c. some special meme here.
"Shi Shi Shi Shi Shi"
non-Chinese speakers may have no idea what this is.
but it's a popular article called "施氏食狮史", which is the best example to show how hard it is for us to read Chinese with only pinyin (or romanized Latin characters).

if you insist on your opinion, then try to figure out what this sentence means:
"Ji Ji Ji Ji Ji"
just to mention: that's also a wonderful article in Chinese writing.

this is just one form of Chinese meme; there are tons of others in Modern Chinese.
like "爷爷", "不星", etc.
this is what the general phenomenon in the Chinese language environment and its romanisation is like.

Latin-Chinese transcription is not reversible; that is the exact truth.

------------------------

if you think separated words of pinyin in Latin characters are a better romanisation of Chinese characters,
then you are wrong.

as a pure system, it's better ofc.
because it helps non-Chinese people read and understand.
and actually it's the ultimate goal Chinese romanisation wants to reach.

but in all other respects, it sucks.
1. automatic work can't be done, so it needs manual work, which is tough and complex. you are not the one doing it, so you won't understand.
2. you still need deep knowledge of "what is a Chinese word" before you can search for one. that's just worse because it's harder.
3. osu staff are not language specialists. they are the best at mapping or map checking, but not in the language area.
4. etc. (too many, so i'll just stop here)

------------------------

the standard of Chinese romanisation is not even built up; "The Basic Rules of the Chinese Phonetic Alphabet Orthography" is just a tool showing the rules by which it could be done, which doesn't mean we can really do it.
it's also why the transcription of Chinese into separated romanized words is called "semi-automatic": because part of it is still manual, and will stay manual for a long time.
unless our AI tech is upgraded to such a high level that it can analyse complex Chinese sentences and do the rest just as the transliteration of alphabetic languages already does. (or maybe you can find some Chinese language specialists and pay them to do this work.)

though i've told you the truth about how complex the word-separating work for Chinese is, here's another part of the ISO document:

ISO7098:2015 said:
10.7 At present, in Chinese linguistics, there is no clear common definition of a Chinese word yet, so it is difficult to decide the boundary (dividing line) of a common Chinese word sometime, and, of course, it poses difficulty to link the monosyllables to form a common polysyllabic Chinese word.
sure, metadata of osu maps is important, but this topic is far beyond what the osu community could do.

waiting for the next progress then.
Wafu
Warning: I will use the word "conservative" in this post. That doesn't refer to your political stance, but to your stance towards this issue.

conservative = a person who favors maintenance of the status quo
So, first of all, I'm quite surprised you are even trying to prove that Chinese is not a logogram language. One says pictogram, one says ideogram, one says ideophonograph. I feel like you're trying to defend this so much that you have to take every single thing we said (even out of context, by the way) and simply make something up and say "do your research", while completely ignoring what we said. You, as Chinese, have no priority in this matter just because it's about Chinese, so stop acting like you are the better ones and we know nothing because we didn't read document X, which nobody even provided us before (and one of you in particular is accusing others of not reading it, while misunderstanding it). Today's Chinese language uses only logograms. The origin of many of these logograms is pictographic or ideographic. You have to realise that something having pictographic/... features doesn't mean it's not a logogram. I think you should know that if you want to use it as an argument. What Hollow Wings said about this, by the way, proves exactly that CrystilonZ was completely right that Chinese is a logogram language; it's just that HW didn't understand the concept mentioned in the ISO file.

I think, as many of you have taken things out of context, I'll have to go through all the posts and respond to them. I will be a bit more technical and in-depth. I hope you will not ignore it and take it out of context again.

Fycho wrote:

Below are that I disagree: (blue are my replies)

Wafu wrote:

  1. Current system
    1. Titles are easy read ✘ (most of people will read every syllable as if it was one word)
    2. Titles are easy to remember ✘ (words are easier to remember than separate syllables, humans remember the words easier by their shape)
    3. Fits the rules of Latin script (Romanisation = writing words from other script to Latin/Roman script) ✘ (Latin script is alphabetical, therefore separating each syllable doesn't make sense and doesn't read well for majority of Latin script)
    4. Fits the rules of Chinese script ✘ (Impossible, if you want to make it "fit" to the Chinese script, you would have to replace each character with one logogram, Latin alphabet doesn't have logograms. Chinese is also not syllabary script, so separating each syllable again doesn't make sense.)
    5. Differentiates between different Romanisations and meanings of the same sequences of characters. ✘
    6. Includes tones in Romanised text ✘ (Impossible with characters which we are limited to. You could use "a1", "a2" (redundant) etc., but that would make the text incomprehensible, majority of people wouldn't even know how to pronounce it)
    7. Doesn't replace characters with others which have no evidence of being similar to the intended character. ✘ (ü is replaced with v, which doesn't seem to be supported by any logical argument)
    8. You can use a different Romanisation system for dialects where the current system wouldn't work at all ✘
    9. Isn't related to politics ✘ (Impossible, picking any Romanisation system is picking a side, every Romanisation system is related to politics)
  2. Proposal
    1. Titles are easy read ✔ sorry, but as a Chinese speaker I don't think they are easier to read than the current separated romanisation format
    2. Titles are easy to remember ✔ They are not easier to remember than the current separated romanisation from a Chinese speaker's side either
    3. Fits the rules of Latin script (Romanisation = writing words from other script to Latin/Roman script) ✔Lantin script isn't fittable to Chinese, Pinyin system is much better from my side too
    4. Fits the rules of Chinese script ✘ (Impossible, if you want to make it "fit" to the Chinese script, you would have to replace each character with one logogram, Latin alphabet doesn't have logograms.)
    5. Differentiates between different Romanisations and meanings of the same sequences of characters. ✔explained above, for Chinese characters, Pinyin or any other romanisation system doesn't have any meaning at all, it's just a way to use as mark for Mandarin
    6. Includes tones in Romanised text ✘ (Impossible with characters which we are limited to. You could use "a1", "a2" (redundant) etc., but that would make the text incomprehensible, majority of people wouldn't even know how to pronounce it)
    7. Doesn't replace characters with others based on no evidence that they are similar to the intended character. ✔ as the pinyin or any other romanisation/lantin letters (they are just used as a mark, as I said) don't stand for meaning, this feels unnecessary
    8. You can use a different Romanisation system for dialects where the current system wouldn't work at all ✔ Dialects don't have an official formal way of writing; even in HK, schools teach Mandarin grammar and write standard Chinese grammar while only using Cantonese as a pronunciation. There wouldn't be songs that use dialects in the song title and artist, so they don't need to be romanised in any case.
    9. Isn't related to politics ✘ (Impossible, picking any Romanisation system is picking a side, every Romanisation system is related to politics)
tl;dr: The current romanisation of Chinese is fair enough; I don't think it needs to be revised.
  1. As for the readability point, you should probably know that Romanisation is not designed for Chinese people. It's not designed for people who use that language daily. You, I assume, are fluent in Chinese. That means you know how to read this language and your ability to read it either way is in no way impaired. Romanisation is for people who use Latin script as their primary writing script. Because such languages separate words (not syllables), these people are used to reading whole words, so something that appears to resemble a whole word is much easier for them to comprehend and read. If you want, we can find some people who have no knowledge of Chinese (and use Latin script), ask one group to read some song titles in the current system, and the other group to read them in the other system. Obviously, both groups will pronounce them poorly, but the latter will be slightly less robotic (I assume Chinese people don't like to hear the "ching chong" epithet. This is one of the reasons it exists; I don't understand how you actually don't see it). They will actually read it more continuously, rather than with a gap between each syllable.
  2. As for the memory point, again, you are considering this point from the Chinese speaker's perspective. That's not the target group. As above, it's about how Latin script works with words. As you probably know, when people who use Latin script read longer words, they generally don't read them letter by letter; they recognize them by the shape of the word. Because of that, they will also miss minor spelling errors, because they read the originally intended word by its shape. That suggests (which is a fact, by the way) that they memorize text that looks like a word much more easily than separate syllables. As an example, you probably have the shape of "Romanisation" memorized pretty well. That means if I misspelled it as "Ronamisation", you would quite likely not notice. Whereas if I wrote "Ro Na Mi Sa Ti On", you would more likely notice the error, because you would read it syllable by syllable. You can look up information about this pretty easily. Again, this can be easily proven the same way as 1. We could just find some people for a memory test, one group given the syllable version, the other one the seeming "words".
  3. Nonsense. I think, if you are interested in Romanisation, you should know that it's not "Lantin", it's "Latin". I could make some assumptions, but I won't accuse anyone of not doing research as some of you have. The point below explains exactly why it's impossible to fit the Chinese script; no Romanisation/transcription/transliteration system can do it. If you think we shouldn't be using Latin, you may have a problem with the current Romanisation system. (It's fully based on Latin script; no other script is allowed in the Romanised title/artist field)
  4. That's not at all what I'm talking about. And you are plainly making stuff up.
  5. How does this make sense? The Romanisation systems we are currently using in the proposal all have very similar readings, in fact as similar to each other as possible. We care about the way people will read that. What you are saying here is that "it doesn't matter if some characters are represented with the wrong character". You are literally saying that it wouldn't matter if we romanise 大 as "Da" or as "Wx". The ü character was suggested as "u", because it sorta sounds like a "swallowed" hard "u", a tiny bit similar to German. That's why it made sense (whether that is the best option is up for discussion, but not up for denial on the grounds that "most keyboard layouts have v here"; there's no linguistic basis behind v, which makes the current system completely invalid)
  6. There's enough major evidence that the current system doesn't work. This is only a conservative stance that you guys have, so that no change happens, because that's the way you are used to it. You should be open to the fact that old things are not always the best and some things just have to be replaced at some point.
Regarding the "how can someone know that this is one word, they don't speak Chinese" argument, we do the same for Japanese Romanisation, that's also one of the reasons we wanted it to be more consistent. Both of these languages have conflicting "words", for some reason, I've never seen anyone complaining about "complicated" Japanese Romanisation, I've mostly seen people having no problems reading, memorizing and saying any Japanese title just from the Romanisation. It's not about knowing that this is a word, it's about the impression that it's a word, which leads to many aforementioned benefits for people who use Latin script.

Hollow Wings wrote:

OK, what a mess.

just a warning: my post will be long.
check it in as much detail as you can to learn about the Chinese language and its romanisation, if you wanna get involved in this.

A. Important things about Chinese Romanisation

I. "ISO 7098:2015".
first of all, learn as much as you can about ISO 7098:2015.
ISO 7098:2015 explains the principles of the Romanization of Modern Chinese Putonghua (Mandarin Chinese), the official language of the People's Republic of China as defined in the Directives for the Promotion of Putonghua, promulgated on 1956-02-06 by the State Council of China. This International Standard can be applied in documentation of bibliographies, catalogues, indices, toponymic lists, etc.
all the contents of this document are important; you may know some already. and there are two parts i wanna specially mention for you:
1. In automatic romanizing working progress, there're two ways for Chinese Romanisation:
a. semi-automatic romanisation from Chinese words separated by following proper rules.
b. automatic romanisation from Chinese characters one by one.
2. During this period of time, most countries other than the PRC can't fully accept romanizing Chinese characters into separated words according to combinations of characters, because finding and dealing with the concept of Chinese words is complex, and the grammar of Chinese sentences can blur it even further.
after thousands of thoughts, they decided to do the romanization from Chinese characters one by one.


↑ this is my opening, just mark it and go on.


II. How special Chinese is as a kind of language.

according to the way characters make up words, languages can be divided into alphabetic languages and ideographic languages, with alphabets and ideograms as their respective characters.

a. alphabetic languages are simple, and most of you can easily grasp the concept. most languages that exist now are alphabetic, each written with its own alphabet. as far as i know:
  1. Cyrillic alphabet (eg. Russian)
  2. Hebrew characters (eg. Hebrew)
  3. Arabic alphabet (eg. Arabic)
  4. Armenian character (eg. Armenian)
  5. Georgian character (eg. Georgian)
  6. Old Geez abjad (eg. Old Geez) ←already dead
  7. Devanagari script (eg. Sanskrit)
  8. Tamil alphabet (eg. Tamil)
  9. Kana script (eg. Japanese)
  10. Hangul script (eg. Korean)
  11. Thai script (eg. Thai)
  12. Tibetan script (eg. Tibetan)
  13. Mongolian script (eg. Mongolian)
... and tons of other alphabetic languages which may not be widely used or just dead.
b. in an ideographic language, every single character was born from some exact thing or matter; this is very different from an alphabetic language.
as far as i know, the ideographic languages are:
  1. Egyptian hieroglyphs (eg. Ancient Egyptian) ←already dead
  2. Cuneiform script (eg. Ancient Sumerian) ←already dead
  3. Seal hieroglyphs (eg. Ancient Indian) ←already dead
  4. Maya hieroglyphs (eg. Ancient Mayan) ←already dead
  5. Chinese characters (eg. Chinese)
and NO MORE.
if you want to know why the language systems are like that, that's a long story, and i won't start telling it here.
the reason i bring up the facts above is that i want you guys to know how specific the Chinese language is, which leads to how differently romanisation is done for alphabetic and ideographic languages.


III. “Transliteration” and “Transcription”
(Wafu: I have to shorten this because of character limit.)

B. Relation to osu community nomination system

CrystilonZ wrote:

Other languages that use the Chinese script are irrelevant to this proposal.
We are only talking about Standard Mandarin here and Mandarin is not equivalent to Chinese.
We only use 'Chinese' in the draft for simplicity. The wording will be changed if this is implemented.
↑i don't know if CrystilonZ knows the whole Chinese language family clearly enough, so i'll add some basic background knowledge here.
ISO 639 code sets
Documentation for ISO 639 identifier: zho
Identifier: zho
Name: Chinese
Status: Active
Code sets: 639-2/T and 639-3
Equivalents: 639-1: zh
639-2/B: chi
Scope: Macrolanguage
Type: Living
Denotation: See corresponding entry in Ethnologue.
The individual languages within this macrolanguage are
  1. Gan Chinese [gan] → 赣语
  2. Hakka Chinese [hak] → 客家话
  3. Huizhou Chinese [czh] → 徽州话
  4. Jinyu Chinese [cjy] → 晋语
  5. Literary Chinese [lzh] → 文言文
  6. Mandarin Chinese [cmn] → 官话(普通话)
  7. Min Bei Chinese [mnp] → 闽北话
  8. Min Dong Chinese [cdo] → 闽东话
  9. Min Nan Chinese [nan] → 闽南话
  10. Min Zhong Chinese [czo] → 闽中话
  11. Pu-Xian Chinese [cpx] → 莆仙话
  12. Wu Chinese [wuu] → 吴语
  13. Xiang Chinese [hsn] → 湘语
  14. Yue Chinese [yue] → 粤语
ok, so, the things above are just for the electronic area. there are still lots of other native languages in the PRC.
and i won't post the PRC's official native language list here, so as not to make things more complex.

since people like CrystilonZ may insist that Mandarin Chinese is the main target and other Chinese systems have no business with it, let's start from the concept level of "macrolanguage":
it actually has a property of "same standard pronunciation and style of writing".
and for Chinese as a macrolanguage, its standard is just Mandarin Chinese.


so the truth is, all Chinese language families DO have a common standard, with hundreds and thousands of connections to it. when you are talking about other members of the Chinese family, they are always affected by the Mandarin system, which is the exact center of the whole topic.
if you wanna get rid of every other Chinese language family, then you need to give another complete romanisation rule to solve the problems that may happen in the transcription process. otherwise, Mandarin Chinese's is automatically the official solution. in that case, we should be careful about its effect on other Chinese language families.

and also, the so-called "Cantonese" is actually a concept of "languages spoken in Guangdong Province", containing "Min Zhong Chinese", "Hakka Chinese" and "Yue Chinese". people just usually use the narrow sense of the concept: they almost regard "Cantonese" as "Yue Chinese".
what's more, the native language spoken in Taiwan is a kind of Min Nan Chinese, in case some ignorant one jumps out.


with all that knowledge above, we can move on:

I. How to deal with words in Mandarin Chinese transcription that come from other Chinese language families but have already become a part of it?

1. Chinese archaism

it's a part of Literary Chinese, but has also become a part of Mandarin Chinese.
some of these words have even changed meaning, and they are hard to distinguish.
if Literary Chinese is regarded as another individual language apart from Mandarin Chinese, then when we meet words like "空穴来风", "闭门造车", "人尽可夫", etc., how do we deal with them?

2. multi-Chinese based songs

for example, there's a Chinese song called "好心分手", one version of which is sung in both Yue Chinese and Mandarin Chinese.
so the Yue Chinese romanized version is "Hou Sam Fan Sau/Housam Fansau" (actually this is Jyutping, a special kind of pinyin)
and the Mandarin Chinese romanized version is "Hao Xin Fen Shou/Haoxin Fenshou".
both of them are exactly correct as spoken, so how do we deal with these?

3. Chinese families with no supported romanisation rules
for example, there's a Chinese song called "外滩18号", which is sung in three kinds of Chinese: Mandarin Chinese, Wu Chinese and "Southwestern Hakka" (an official native Chinese language of the PRC).
so it can be romanized like:
Mandarin Chinese: "Wai Tan Shi Ba Hao/Waitan Shibahao"
Wu Chinese: "Nga Thae Tze Ba O/Ngathae Tzebao"
Southwestern Hakka Chinese: "Vai Tan Si Ba Hao/Vaitan Sibahao"
i'm not sure if those are correct apart from the Mandarin ones (i just typed them here after searching dictionaries of the native romanisations), but each part can still have its own romanisation, right?
then how do we deal with these?


II. Even if we transcribe Mandarin Chinese into Latin characters as separated words, who is going to help mappers who map a Chinese song?

it has several parts:
  1. is this a Mandarin Chinese song?
    - maybe you can tell from official settings or sites, not a big deal. but that won't work if you map some cult song.
  2. how to get the right romanized characters?
    - ask some Chinese staff/mapper/player? i doubt any of them have the time/ability to do it.
  3. how to make sure the things i got are correct?
    - kind of the same as the one above. if such a person exists and can do this job endlessly, he will be really welcome in this system.
you may think most Chinese words aren't as complex as that, but if you wanna build a reasonable system of rules, it should be strict.
and as long as you're not the person who does this kind of work, you can hardly imagine whether it's hard to do or not.


C. Summary

I. Opinions

1. even international-level groups can't do much Mandarin-Latin romanisation with separated words.
it's feasible, that's true. but its efficiency is really, really low.
Chinese staff will be worn out to death if they really do this, because as you can see from what i've explained, it's tough work with a tough process.

also i can even predict that someone will search for the correct Mandarin romanisation for a month, and still get dqed after finding that the answer he got is wrong. that may stop people from mapping Chinese songs, and personally i think that's really bad news.

2. Mandarin Chinese and Cantonese have standard romanisation rules, but the other Chinese families don't. it's hard to complete one if you don't care about all of them, for every single one of them has a common standard pronunciation and style of writing: Mandarin Chinese.

in that case, rebuilding the Mandarin Chinese romanisation system into a better and complete one will be really hard work, and it's for sure out of the osu community's range.

3. the Chinese osu community already argued about this several times a long time ago, and the result is still: keep the current state.

II. Conclusion

doing romanisation of Mandarin Chinese character by character is the best way SO FAR.
at least until some genius invents a dictionary for Mandarin-Chinese-characters-to-Latin-characters romanisation and makes the process a lot more efficient than the current one.
and also, this is exactly what international groups do right now. (they only combine proper nouns like people's or place names, etc.)

--------------

simple extra p.s. here:
to CrystilonZ, and other people who know little about Chinese:

i think you have some wrong ideas about Chinese characters, for i've seen this written:

CrystilonZ wrote:

Similar to Japanese, one Chinese character does represent one single syllable. However, a word is not necessarily comprised of one syllable (like Japanese, Chinese is a polysyllabic language).For example 图书馆 (túshūguǎn) as a whole means library, and writing 'li bra ry' would defeat the purpose of Romanisation by not resembling the structure of languages using the Roman alphabet.
Chinese is far different from Japanese. the syllable thing you are talking about may be true of Japanese Hiragana or Katakana, but not so much of the Kanji part.
(btw, you may already know that a part of the Japanese writing system is just exactly Chinese.)

and now after reading all the things i wrote above, you may know Chinese is not only a kind of polysyllabic language, but also the only living ideographic language.
"图书馆" reads "tú shū guǎn" and means "library", true.
However, "图书" reads "tú shū" and means "library book" or just "(picture) book", did you know that?
this is far different from English, where you can't separate a word in most cases: you CAN separate a Chinese word, because every single Chinese character can be a word.
eg.
图→graph, graphic, or lots of other meanings;
书→book, writing, letter, or lots of other meanings;
馆→shop, embassy, gallery or any building that shows something it wants to.

so, the one-character-one-word method is a solid, reasonable method for Chinese romanisation.

with knowledge of these, hope you can restructure your idea about Chinese, which should help you understand the previous romanisation part.

--------------


hope all of these things help you know more about Chinese romanisation.

also if you are confused about anything above, you are always welcome to ask.
A
  1. 1. You have to remember that ISO being international doesn't mean that we have to follow ISO (After all we would be breaking many of their standards, even in the other Romanisation systems). There are many references and standards that are not ISO and are better in quality and design. It's not like taking one thing that benefits you is going to help. Regardless, as I've read this standard (and if you are concerned that I may not have done some research, this is a tiny part of what I've read to make my opinions on this issue, the research is much deeper than a single ISO document), I think it benefits you less than you think it does. In fact, from what arguments you are quoting, you don't seem to understand it very well on your own (or maybe you just expressed yourself poorly, but you misrepresent the document you are promoting here).
  2. 2. I already addressed this point in the previous post, but it was kinda taken out of context. I already explained this in relation to Chinese script incompatibility. Anyway, what you're trying to say here is nonsense. None of the listed languages are actually ideographic. All of these are logogram languages that partially use ideographic characters, but mostly characters that just originate from ideographic characters. The languages themselves use logograms.
  3. 3. This has been addressed already. There's no reason why the missing reversibility factor would impair Latin script users from reading or memorizing this (that, however, is impaired by the current system which is reversible). It only has impact on people who actually can use Chinese language, these people do have a solution. The original title/artist is still present, so they don't need to reverse it to logograms. For Latin script users, there's no reason why they would want to reverse the text.

    This is the part that many of you have taken out of the context, and therefore misunderstood it. Again, Romanisation (internationally), is not designed for people who are fluent in Chinese, so there is no reason why they would want to convert it back to logograms. For those who are fluent in Chinese, you literally have it in the original title/artist.

    "Transliteration won't do, we need to do "Transcription"" argument doesn't make much sense. Not only, as I explained above, it's really not needed in this scenario, but at the same time, you are saying this and want to support a system that omits the phonetics? That literally makes it transliteration.

    "we chinese ourselves even cant understand what those words said in a short time, if they are all written in Latin characters of pinyin one by one" Don't make this up. If you can't understand Romanised text, it's because you can't process Latin script, again, there's the original title/artist for you to clarify it. You are not the primary target of the Romanisation. It's more important for a regular player to be able to memorize and read the title/artist, than for Chinese player to memorize, read and understand both the original and Romanised titles/artists.
B
  1. 1. Not worth talking about. This is a bit more out of topic and should already be clear from the previous discussion.
  2. 2. The same way we're doing it for Japanese. And we are doing it for Japanese. The "Metadata Heap" Discord server is quite a big one and people do solve issues here quite quickly and effectively. Even for the languages they don't know very well. You really underestimate this community if you think it will be only and only up to staff members. Sure, they have to recheck, but if this is discussed in the channel, they generally have a good starting point. So far, I haven't seen a problem that wasn't resolved there, it really shouldn't be that big of a drama that you make it look like (And yes, even QATs/GMTs are active here, but they don't do majority of the requests on their own). This is, therefore, not an issue.
C
  1. 1. 1. Same as for the last point in B. This is not an issue at all. As for the second part of it, I don't think people take metadata DQs so negatively. I don't remember a single time that people stopped mapping songs of a certain language due to complicated metadata, even since metadata became more strict.
  2. 1. 2. Already addressed, this is not what should be discussed. Current system doesn't solve this even though you may imply it does. It doesn't.
  3. 1. 3. The result is "keep the current state" because of the conservative stance. From what I've seen, Chinese people argued for this system poorly and detached from the community that it's primarily about. Now the target community argues for something else, that doesn't mean you just keep conservative stance because we are not Chinese. The argument can never be that "it was discussed by Chinese community" because it's not only about you. We also don't say: "You can't judge most of the things because you don't know how Latin script languages work.", so don't do that to us.
  4. 2. Already explained in above paragraphs.

Regraz wrote:

Regarding the Romanisation of Mandarin, I would like to post my comments here.

Firstly I would like to start with the following proposal:

Proposed Rules wrote:

The ü vowel should be Romanised into u and all diacritical tone marks should be omitted because of the technical limitations resulting from the limited amount of characters allowed in the Romanised title/artist fields.

CrystilonZ wrote:

speaking about u and v here. v is just impossible to pronounce. I'm always open for a better alternative.
Please understand first: if you want to change the current rule, namely from ü to u, you have to prove FIRST that u is a better choice than v, instead of announcing you are going to change it to “u” while asking us to provide a better choice. There are plenty of letters and characters that could be chosen, so why did you choose u? Just because they look similar after omitting what you call the “diacritical tone mark”? I don’t think that is a reliable reason for this change, as judging only by visual appearance is pretty unprofessional when talking about romanization. Additionally, Fycho has already mentioned the potential mess that might result from changing v to u, indicating that this entry in the proposal does not only have pros. Therefore, prior to this discussion, you should not simply say “The ü vowel should be Romanised into u…” and justify the change only by why “ü” cannot be implemented in the current system due to technical difficulties. You should explain why “u” is better than “v” with valid reasons (“u” can be pronounced is not a valid reason: there are many characters that can be pronounced, like a, e, i, o and some two-letter combinations like yu, which Fycho mentioned. All of them have pros and cons, so why did you give preference to u in this draft?), as well as how you are going to address potential problems if this “u” proposal is implemented.

Again, if you would like to change the current criteria, try to form solid reasons and show people why your proposal is better than the current one. Saying “I am going to change this into that, if you don’t have better choices then this will be the new criteria.” sounds pretty irrelevant and illogical, and shows a kind of manipulation of the criteria for Romanization of Mandarin.

I would like to proceed to comparison between current and proposed system in the previous discussion:

Previous Discussion wrote:

  1. Current system
    1. Titles are easy read ✘ (most of people will read every syllable as if it was one word)
    2. Titles are easy to remember ✘ (words are easier to remember than separate syllables, humans remember the words easier by their shape)
  2. Proposal
    1. Titles are easy read ✔
    2. Titles are easy to remember ✔
I don’t think titles are easier to read and remember with the proposal.
How do you expect speakers who don’t know how to pronounce “ü“, “v” and “u” to differentiate syllables and words under Romanisation of Mandarin?

For non-Mandarin speakers, there is no difference in readability between “Wo De Wei Lai Shi” and “Wo de Weilaishi” or any other combination like “Wode Weilai Shi”. They have no idea what is a syllable and what is a word. If you think words are easier to remember (you did not post any proof or research regarding this either), why can’t a player treat the syllables as words? After all, the player has no idea whether what they are reading is a word or a syllable. There are fewer syllables than words in total, so they should be easier to read and memorize!
  1. 1. The point at the beginning was already addressed. Again, taken out of the context. You read nothing. "v" had no linguistic basis whatsoever. The accuracy of the accent doesn't matter, even if you'd exactly hear it, if you can't speak Chinese, you won't pronounce it right. I explained why "u" has been chosen in this post. Again, nobody ever said this is the final decision. It's just that it sounds like deep and unvoiced "u", as if someone punched your chest, similar to German ü, but not the same. That's why it's "u", there was no other letter (with reasoning) suggested and "v" had absolutely no similarity to it. If you call it unprofessional, basing it on keyboard layout and not linguistics, to us, seems less professional.
  2. 2. Issues of both systems (which were more severe for current system) were already addressed in this post and also in the previous posts in the proposal thread that you didn't read, otherwise you wouldn't say there are no reasons.
  3. 3. Readability, we've been over it already. Even in this post.
  4. 4. Yes, there is a difference for non-Mandarin speakers. Explained it previously, and as I said to Fycho, we can do the tests with people that use Latin script if you want this kind of proof.
The rest is just an outrage and has already been explained.
"First, In formal Chinese writing, there is no logograms as well." is complete nonsense. There is no character in Chinese that is not a logogram.
"So the table of proposal in fact should be modified like this" is a part I don't understand. The tables in the proposal were to show what needs to be fixed in the first system if you want to make it work and why the proposed system solves majority of the issues that can be solved at this moment. It's not to show "how much better" the proposed system is. If you are angry because of a table, have to take it out of the context without reading the text related to the table and even just edit it this way without even giving specific reasons for doing so, I'd welcome if you would refrain from even giving your input here. We're trying to give as many reasons as possible and explain every single thing that is mentioned, unfortunately, some of you just can't take it seriously, yet want to talk about it.

Fycho wrote:

The main arguments are listed below:
  1. If we romanise Chinese title in word-by-word way(each character must be romanised into a single, capitalised, separated word) or generally every word should be separated and capitalised according to The Basic Rules of the Chinese Phonetic Alphabet Orthography.
  2. If using "yu" or "u" for the romanisation of the vowel "ü".
  3. If we need to distinguish dialects from Mandarin in romanisation.
For the first point, I recommend everybody read ISO7098:2015 before sharing opinions. The romanisation of Chinese is much more complex than that of other languages, and it needs a lot of professional knowledge about Chinese. The new proposal can't handle “a word or phrase with double or more meanings”. For example, a specific case like "他谁都打不过" is used intentionally to carry two meanings, "Nobody can beat him" and "He can beat everybody": "Ta / Shui / Dou Da Bu Guo" and "Ta / Shui Dou Da Bu Guo". And it wouldn't be easier to remember / read for Chinese / non-Chinese speakers. I am not going into detail, as someone else would like to give a more professional explanation.

For the second point, currently, "v" is used a lot. "ü" is a one-word vowel, and it works differently in pronunciation from two-word vowels like "iu", "an", "ie", "üe", "ai", "ao", etc... We use "YU" for "ü" only in passports and other specific cases, because the passport requires the name in capital letters and "ü" doesn't have a capital case. Elsewhere, it is still "v". For a one-word vowel, "v" is the most common and familiar letter and it's officially supported, and that is what the majority of input keyboards use. I believe using "yu" for "ü" only makes it easier to read than "v" for non-Chinese speakers, but it's technically wrong, and there aren't any other benefits. The "u" of the syllable "yu" is actually and technically the vowel "ü", and after "j / q / x / y / w" we use "u" for "ü", but that doesn't mean "u" can completely stand for "ü", and it doesn't mean "yu" can stand for "ü". "y" isn't a vowel in the Pinyin system at all; "y" is a consonant that has the same pronunciation as the vowel "i", while "iu" and "yu" are two completely different things. In the pinyin system, "vowels" can't be made from a "consonant". That means "yu" can by no means become a two-word vowel, and using "yu" for romanisation would totally disobey the language system. "v" works best at the moment.

For the third point: is it necessary to distinguish dialects from Mandarin in romanisation? As all of us know, dialects differ in pronunciation, and some have different grammar. However, none of the dialects has an official written format, and all the dialects are related to Mandarin. A lot of Chinese words are shared by all the dialects, like "好心分手": you can't know if it's Mandarin or Wu-Chinese or Cantonese unless someone pronounces it. Officially and technically we can't tell what it is, so it's just modern standard Chinese, and we romanise it in the standard way. Personally, I am a dialect speaker, and I can speak Wu-Chinese and Mandarin well. The major issue is that there isn't any official way to write the dialects. This is because it's not like the Japanese dialects: Japanese kana (Hiragana, Katakana) are the same as Latin script in that they are phonograms, whereas Chinese characters are ideographic. This means Chinese characters can't be used to represent the pronunciation of dialects, so there won't be any official written form of the dialects, and there won't be any song title written in dialect. There aren't any officially published ways to romanise the pronunciation of dialects. Therefore it's unnecessary to distinguish dialects from Mandarin. By the way, if you are going to mention Cantonese (Yue-Chinese), there isn't any official written form for Cantonese either. In HK and Macau, schools teach the standard Chinese written form, and people personally like to type Yue characters in Cantonese, which is more like a culture. It's not taught by schools officially. Enforcing something unofficial just makes us end up with endless discussions; that's why there isn't any official romanisation so far, because we have already argued a lot in the real world and haven't come to a conclusion.
How can we romanise an independent language that doesn't even have a written format? I believe this is beyond the osu! community, and it's unnecessary to figure it out at the moment.


I've asked some Chinese-speaking QAT/GMT (Nardoxyribonucleic, spboxer3 and Zero__wind) for opinions about the proposal, and all of them think it's not necessary to revise the current romanisation rules for Chinese.
  1. 1. For the ISO document part, again, I want to mention this is not the only document that exists. It was taken into consideration during the creation of the proposal. But yes, it's generally good to read. I also already explained why it would be easier to remember, read and pronounce for the target group.
  2. 2. No, "v" doesn't work the best, it doesn't work at all because it has no linguistic basis. That doesn't mean "u" is the best, although we agreed that it generally won't make difference for a regular player, there are still many options that can be considered, but it can't be "v", and probably not "y", because that's associated with a different sound (even in other Romanisation systems we use). "u" pronounced in a certain way will result in the "ü" we are going for, it really is the core sound of it, I described this in the post 2 times already, so I guess I don't have to repeat myself.
  3. 3. It is sort of important. Because some things could be pronounced so differently that reading it regularly wouldn't be even close. Mostly, it would be just Mandarin and Cantonese, I don't believe any other dialect will be used in osu!. Anyway, sticking to the "official" systems shouldn't be considered bad. In this situation, it doesn't do any harm. In osu!, we'd preferably use whatever system works the best for us, based on the similarity to current Romanisation systems and other aspects already mentioned here (Latin script user readability etc.), it's best to keep the minority of languages up to case-by-case decision. (Most of the time, it would use this Romanisation system anyway, but in case we'd think it is needed, we would go for case-by-case, pretty common in these situations). Sticking to 100% official systems only delays us, it's a thing you have to deal with in case-by-case situations. The current Romanisation can't be officially applied to all dialects of Chinese either.
  4. 4. Again, this is a conservative stance that doesn't have any proper reasoning. You either have to solve things or replace them. Not admitting that current system doesn't work because it's easier or you're used to it shouldn't override all the arguments against it.

Hollow Wings wrote:

(maybe i'm not attentive enough... )

------------------------

CrystilonZ wrote:

For point 1. I just don't see how this is related to our discussion. " In automatic romanizing working progress"
1. if the osu community doesn't use an automatic or semi-automatic process, it'll be manual. and i just told you all about why that process is complex.
2. other alphabetic languages can be romanized automatically, so that's what the osu community is doing.

are you gonna let that complex work be done manually by Chinese osu staff? (you are not Chinese so you won't be the one doing it anyway.)
i'd prefer to just get rid of that and keep what we have: automatic romanisation, character by character.

------------------------

CrystilonZ wrote:

2. Can you quote the exact words from the document? also all the reasons as stated in the standard as well. I couldn't read it while working on the proposal because 115 swiss franc is hella expensive.
omg... you don't even have any channel or way to read that document? then you may not know a lot of the concepts it mentions.
and if you haven't read it, then you didn't even meet my previous post's precondition, which is bad news to me.
i still recommend you try hard to find a way to read that document.

since you're so strict about that, i'll paste some parts of the document. (but since it's copyrighted, i'll just paste the text here and not the original pictures.)

ISO7098:2015 said:
12 Automatic transcription for named entities

In the computer-assisted documentation, there are two approaches to automatic transcription for named entities, namely:

- fully automatic syllable transcription;

- rule-based and semi-automatic word transcription.
since you didn't read that document, i just wanna say that:
the main part of the document is just discussion about how to transcribe proper nouns (or just "names") of places and persons.
that's how i summed up ISO7098:2015 in my previous post.

and i emphasize this again: at the international level, most Chinese words are still transcribed character by character into Latin characters of pinyin.
ISO7098:2015 just made a small step: it combines proper names.
the romanisation of Chinese in ISO is far from complete.

i don't think the osu community can do what ISO wasn't able to do.
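
to make the two approaches from the ISO quote concrete, here is a rough Python sketch. the reading table and the word list are tiny and made up for illustration (they are not taken from ISO 7098 or from any real dictionary), and the greedy longest-match step is only one simple way the word-based approach could be done. the point is that the word-based version only works as well as the word list someone has to maintain by hand.

```python
# Hypothetical per-character readings (toneless pinyin); not a real dictionary.
READINGS = {"图": "tu", "书": "shu", "馆": "guan",
            "好": "hao", "心": "xin", "分": "fen", "手": "shou"}

# Hypothetical word list that a human would have to maintain.
WORDS = {"图书馆", "图书", "好心", "分手"}


def by_syllable(text):
    """Fully automatic: one capitalised syllable per character."""
    return " ".join(READINGS[ch].capitalize() for ch in text)


def by_word(text, words=WORDS):
    """Semi-automatic: greedy longest match against a word list."""
    out, i = [], 0
    while i < len(text):
        # Try the longest candidate first, fall back to a single character.
        for j in range(len(text), i, -1):
            if text[i:j] in words or j == i + 1:
                joined = "".join(READINGS[ch] for ch in text[i:j])
                out.append(joined.capitalize())
                i = j
                break
    return " ".join(out)


print(by_syllable("好心分手"))  # Hao Xin Fen Shou
print(by_word("好心分手"))      # Haoxin Fenshou
print(by_syllable("图书馆"))    # Tu Shu Guan
print(by_word("图书馆"))        # Tushuguan
```

if "图书馆" were missing from the word list, the greedy match would fall back to "图书" plus "馆", so the output depends entirely on what counts as a word.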

------------------------

CrystilonZ wrote:

Read more about ideograms here. These are logograms. Modern Chinese characters are logographic.

A number of lines after this are about pinyin being a method of transcription. No comments there this is acknowledged since the beginning that this is just the way to pronounce stuff. And the next few lines are about Mandarin having a lot of homophones.
seems like you are really obsessed with concepts, maybe it's my bad for simplifying those things.

then let me explain clearly: being an ideogram, as i called the Chinese character, is one of its properties, as with other ancient scripts.

the so-called "logogram" is not the exact definition of the Modern Chinese character. let's see what ISO7098:2015 says:

ISO7098:2015 said:
2.6
ideophonographical character

graphic character (2.6) that represents an object or a concept and is associated with a sound element in a natural language.

EXAMPLE Chinese hanzi 鹤(crane), Japanese kanji 戦(war) and Korean hanja 册(book) are ideophonographical characters.
just to mention: hanzi (Chinese), kanji (Japanese), and hanja (Korean) read similarly, right? they all come from a common source: Chinese characters (汉字). and that is the exact "Chinese character" i pointed out in my previous post as an ideogram.

and some additional knowledge here: you may know that, at the VERY FIRST, alphabetic characters were ideographic characters as well. people later just got rid of their meanings and used those characters as a tool to build words, which didn't happen to Chinese.
(like when you see the character "m", you may see nothing or you may see everything. that's not how Chinese characters work.)

now we can compromise on concepts: the Modern Chinese character is a kind of ideophonographic character.

(and also you may know that both ancient Chinese characters and alphabetic characters were ideographic characters.)

------------------------

CrystilonZ wrote:

This is not exactly true. If it were Mandarin would have been dead a long while ago because the only way to communicate would be carrying a crap ton of paper with you at all time and write stuff when you want to communicate.
In English context it would be equivalent to you guys seeing or hearing /tīm/ (IPA stuff. This reads time). Intuitively the first thing that come into your heads would be the time. Tick-tock clocky stuff. However under different contexts:
"Can you buy me some /tīm/. I'm going to use it to cook dinner." In this case /tīm/ is the herb thyme.
"I don't have enough /tīm/ to do my homework. It's due tomorrow." In this case it's "time"
"Two /tīm/ two equal four." In this context it means multiply. 10/10 grammar.
As you can see they are reversible with context. And when you guys speak to each other you're actively tracing back to the original Hanzi characters using their pronunciation. Therefore, saying that it is not reversible is not true. It's harder in Mandarin (410 syllables - crap tons of words. Do the maths) but the fact that there are people speaking Mandarin proves the fact that it's possible.
lol

NONSENSE.

i think you still don't have enough understanding of what Chinese words and sentences can become.

again, you CAN'T simply know which Chinese characters those exactly are, unless you fully understand all of the components both in and out of the sentence.
if you just get the sentence without any other context, you will never be able to do that, which means the sentence has various meanings.

here are some examples:
a. one of the best examples here, which shows that if you make a mistake with it, you may get in big trouble.
"Gu Niang, Shui Jiao Yi Wan Duo Shao Qian?"
this sentence mainly has two meanings:
1. "Hey girl, how much does it cost if i buy a bowl of your dumplings?" (姑娘,水饺一碗多少钱?)
2. "Hey girl, how much does it cost if i sleep with you for one night?" (姑娘,睡觉一晚多少钱?)
this happens widely in electronic alphabet systems without tones, just like the osu system.
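
a small Python sketch of why dropping tones causes this. it assumes the toned pinyin "shuǐ jiǎo" for 水饺 and "shuì jiào" for 睡觉, and it uses the standard `unicodedata` module to remove combining marks. this is only an illustration of the collision, not how osu or any real tool processes titles.

```python
import unicodedata


def strip_marks(pinyin):
    """Drop combining marks: tone marks, and the diaeresis of ü as well."""
    decomposed = unicodedata.normalize("NFD", pinyin)
    return "".join(ch for ch in decomposed
                   if unicodedata.category(ch) != "Mn")


dumplings = "shuǐ jiǎo"  # 水饺
sleep = "shuì jiào"      # 睡觉

print(strip_marks(dumplings))                        # shui jiao
print(strip_marks(sleep))                            # shui jiao
print(strip_marks(dumplings) == strip_marks(sleep))  # True
```

note that this naive stripping also turns "ü" into "u" as a side effect. that is just what the code does, not an argument for either side of the u/v question.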

b. a more common one here.
"Jie Dao Shou Zhang Zai He Shang"
this is too complex, so i'll just do some transcription, and you can "do the maths" yourself and see if you can figure out everything that sentence may mean:
1. "Jie Dao" → 接到(catch/catch up/get/take/etc.), 街道(street/road/way/etc.), etc.
2. "Dao Shou" → 到手(already get sth./reach your hands/etc.), 倒手(transfer things between hands/buy in&out/left hand/etc.), etc.
3. "Shou Zhang" → 手掌(palm/people you trust/etc.), 首长(boss/highest level person/etc.), 收账(charge/blackmail/etc.), etc
...
oh hell, i won't continue.

this also happens even if you have the words separated:
"Jiedao Shouzhang Zai Heshang"
↑ try your best to figure out what this means, and i can predict that you'll find at least 4 meanings.

that's why i'm always saying it's complex:
alphabetic characters can be transliterated immediately, even if you don't know what the word means.
and this won't work for ideophonographic characters, especially Chinese characters.

and that's also the detailed reason why it's not reversible.
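
the "do the maths" part can be sketched in a few lines of Python. the candidate lists below are copied only from the examples in this post (real dictionaries list more), and the `readings` helper is a made-up name for illustration. it just multiplies out every combination, which is why going back from toneless pinyin to hanzi has no single answer.

```python
from itertools import product

# Candidate words copied from the examples above; real dictionaries list more.
CANDIDATES = {
    "Jie Dao": ["接到", "街道"],
    "Dao Shou": ["到手", "倒手"],
    "Shou Zhang": ["手掌", "首长", "收账"],
}


def readings(*syllable_groups):
    """All hanzi strings a sequence of toneless pinyin groups could stand for."""
    options = [CANDIDATES[group] for group in syllable_groups]
    return ["".join(combo) for combo in product(*options)]


results = readings("Jie Dao", "Shou Zhang")
print(len(results))  # 6
print(results)
```

even with only two or three candidates per group, the count grows by multiplication with every extra group, and a human still has to pick the one that fits the context.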

c. some special meme here.
"Shi Shi Shi Shi Shi"
non-Chinese speakers may have no idea what this is.
but it's a popular article called "施氏食狮史", which is the best example to show how much it affects us to read Chinese with only pinyin (or romanized Latin characters).

if you insist on your opinion, then try to figure out what this sentence means:
"Ji Ji Ji Ji Ji"
just to mention: that's also a wonderful article in Chinese writing.

this is just one form of Chinese meme, there are tons of others in Modern Chinese.
like "爷爷", "不星", etc.
this is what the general phenomenon in the Chinese language environment and its romanisation is like.

that the Latin-Chinese transcription is not reversible is the exact truth.

------------------------

if you think pinyin with separated words in Latin characters is better as the romanisation of Chinese characters,
then you are wrong.

as a pure system, it's better ofc.
because it helps non-Chinese people read and understand.
and actually it's the very final goal Chinese romanisation wants to reach.

but in every other respect, it sucks.
1. the work can't be done automatically, so it needs manual work, which is tough and complex. you are not the one doing it, so you won't understand.
2. you still need deep knowledge to know "what is a Chinese word" before you search for one. that's just worse because it's harder.
3. osu staff are not language specialists. they are the best at mapping or map checking, but not in the language area.
4. etc. (too many, so i'll just stop here)

------------------------

the standard of Chinese romanisation is not even established yet. "The Basic Rules of the Chinese Phonetic Alphabet Orthography" is just a tool showing that there are rules for a way to do it; it doesn't mean we can really do it.
it's also why we call the transcription of Chinese into separated romanized words "semi-automatic", because part of it is still manual, and will remain manual for a long time.
unless our AI tech is upgraded to such a high level that it can analyse complex Chinese sentences and do the rest just as transliteration of alphabetic languages already does. (or maybe you can just find some Chinese language specialists and pay them to do this work.)

though i've told you the truth about how complex separating Chinese words is, here's another part of the ISO document:

ISO7098:2015 said:
10.7 At present, in Chinese linguistics, there is no clear common definition of a Chinese word yet, so it is difficult to decide the boundary (dividing line) of a common Chinese word sometime, and, of course, it poses difficulty to link the monosyllables to form a common polysyllabic Chinese word.
sure metadata of osu maps is important, but this topic is far from what osu community could do.

waiting for next progress then.
  1. Again, nothing is going to be "up to Chinese staff", they don't even communicate about metadata much. It doesn't need to be automatic. Japanese is also not automatic. Your assumption that people who are not Chinese can't do Chinese metadata is incorrect. There are people who do it (e.g. in the aforementioned Discord server for metadata), some do it more reliably than Chinese people. There are many people who look up Japanese characters one by one and they are very accurate with it, even though you have to think of exceptions (there are even Chinese alternatives that are more accurate, btw.). This is not a problem again, unless you assume that where you are born determines what metadata you can do, which is not true.
  2. I don't understand why you would enforce some "preconditions" in your post. I understand you want people to do some research, which is okay, but remember it works vice-versa.
  3. Ancient Chinese was not an ideogram language. You've also proven CrystilonZ right, you just used different terms.
To your point about us being "wrong". You don't know what Romanisation is about. The idea that Chinese Romanisation's goal is not at all to help non-Chinese people is nonsense. We are not in China where you don't care. We are in osu!, which is an international game. The concept of Romanisation in China is different from Romanisation worldwide (target = people using Latin script, this is what we need to use, otherwise we wouldn't have Romanisation at all). Consider everyone equal rather than saying that someone doesn't understand how hard it is because they are not Chinese. 1. Yes, CrystilonZ could Romanise Chinese. 2. That doesn't say much, but no, you don't need to have extreme experience in Chinese to find the correct artist name and song title. 3. I don't know where your information about staff's real life and education stems from. osu! is not a full-time job, so these people can very well be experienced in languages. And no, we don't need to pay language specialists, we never did pay anyone and even complicated metadata is being actively produced.

Conclusion:

I think I addressed all the relevant points. I want to make it clear that if some parts sound harsh, that's not what I intended. I just want this discussion to be fair for everyone. (and without Regraz's attempts to make fun of someone)

I hope this makes some things clearer. The biggest issue I see you guys didn't understand is how Romanisation works internationally, outside of China. That's probably what makes you all not realise that being able to convert Romanisation back to Chinese is not the primary intention (especially not in osu!, where the original Chinese titles will be visible)
Shad0w1and
I would suggest keeping what we have right now. I have been searching around for an actual standard for Chinese to be romanized to ANSI code, however, there is no standard for that in pinyin.
The Chinese government did have some standard for romanization, but it is not really applicable because it is more like an attachment to the English translation guidelines. There isn't a standard for romanizing Chinese into English ANSI code.
I did not read that ISO document but I assume it is for romanization into Latin. It should not be considered as a standard for osu RC because it is a different case.
In China right now, road signs mix pinyin with tones, English translations, and separated pinyin. The government suggests using English translations throughout the country. And even though that joins the Chinese characters identified as a noun into a single English noun, this is not a romanization standard!
example:
Chinese: 青白江路
English direct translation: Blue White River Road
Pinyin: Qing Bai Jiang Lu
The government suggested English translation: Qingbaijiang Rd
Other commonly used romanization methods: Qingbaijiang Lu, Qingbai River Rd, Qing Bai River Rd, Qing Bai Jiang Lu, Qing Bai Jiang Rd
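
To make the difference between two of these variants concrete, here is a rough sketch. The function names `per_character` and `per_word` are made up for illustration, and the grouping of syllables into words is passed in by hand, since choosing that grouping is exactly the part that has no agreed rule.

```python
def per_character(syllables: list[str]) -> str:
    """Every syllable becomes its own capitalised word."""
    return " ".join(s.capitalize() for s in syllables)


def per_word(words: list[list[str]]) -> str:
    """Syllables of one word are joined; the caller must supply the word boundaries."""
    return " ".join("".join(word).capitalize() for word in words)


# 青白江路, with the segmentation (青白江)(路) assumed by hand
print(per_character(["qing", "bai", "jiang", "lu"]))
print(per_word([["qing", "bai", "jiang"], ["lu"]]))
```

The first call needs only the syllables, while the second also needs somebody to decide where one word ends and the next begins.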

So let's face the reality, there isn't a standard for Chinese romanization into ANSI code. I can't understand that without a commonly accepted standard, why would you guys try to change the current metadata rule?
While the English translation shows in favor of putting words together, they do have a lot of exceptions in the real world that you won't be able to find in any document. There are a lot of jokes about romanization in China, and I would say please do not think too much about making a standard for osu. If there is no standard, we should go with the current one.
Mafumafu
Regardless of the totally illogical post Wafu made above, I find it interesting and ironic that Wafu’s post here is in fact contradicting what Wafu sent me in forum private message.

I even could not help laughing when Wafu said:

Wafu wrote:

…and without Regraz's attempts to make fun of someone
Maybe Wafu made a provocative (but bad?) attempt to label, defame, calumniate and libel others? However, from his PM to me, it seems Wafu himself even failed to keep his words civilized. I will attach the screenshot of that forum pm here for everyone to read.



Who do you think is actually making fun of others?

In fact, I do not want to waste time replying to what Wafu posted here as I believe if anyone would like to participate in any kind of discussions, they first ought to learn how to speak properly and get rid of any habits of assaulting others inherited from whatever personal life or background. Yet for the sake of this metadata draft, I tried my best to be benevolent and philanthropic, showing some leniency toward those who cannot speak in a civilized way and filter those profane words when elaborating my reply.

Wafu wrote:

Chinese is a logographic language. It's not pictographic nor ideographic language. They use a some characters with pictographic/ideographic features, but that doesn't make Chinese a pictographic/ideographic language. You guys don't even know your own language. Wake the fuck up.
Sadly, arrogantly defining Chinese as a logographic language is a pure fallacy. In formal Chinese writing today, Chinese characters are input by keyboards in a syllabic way, and it is the dominant method of inputting Chinese (and its Romanization as well) in osu! amongst players. Under the mixed impact of other languages and the development of the currently dominant input method for Chinese on the Internet, especially when it comes to romanization, the so-called logographic characteristic of Chinese is increasingly ambiguous. Your statement is already groundless and archaic because you still stay with writing Chinese characters on paper, instead of considering the input method with keyboards, which is syllable-oriented and in fact supports the current syllable-based metadata (Romanization) scheme.


Additionally, I was really shocked to see that a former BNG member and current(?) UBKRC member could end up assaulting others when they failed to provide solid reasons for their own statements.

osu! Rules wrote:

Be productive with your criticism without resorting to personal attacks. Criticism is a wonderful thing when done properly, but if you're resorting to personal attacks to make your point, you're doing it wrong and you should feel bad.
I hold respect toward the entire UBKRC team as they comprise the most experienced people in criteria elaboration and modification, but this just makes me quite disappointed. It is really lackluster, and even a blemish.

At last, I do have some pieces of personal advice for the one who is all the time showing uncivilized, barbarian-like behaviors here:
1. Before getting involved in this discussion, learn how to speak properly, instead of acting like uncivilized philistines. Insults, personal attacks or profane content would not help you in this discussion. They only illustrate that you failed to support yourself with solid reasons.
2. Learn the basics about metadata and Romanization. I do recommend knowing the basics about Mandarin as well, since Romanization is work that requires knowledge about both ends, though I do not expect you to do this much because it seems you have no idea what Romanization is.
3. Be consistent with your behavior. Pretending to participate in the discussion actively while sending out personal attacks backstage is really a naive and ignorant act. Especially when your participation in this discussion is full of illogical, self-contradictory, disrespectful, rude content to others. Again, those will NOT make your points accepted by others but could only impair your infinitesimally remaining reputation.
4. Try forming a solid response toward statements or ideas of others when you disagree. However, this only applies after you have proved that you fulfill the first three points. Personally, I do not expect you to fulfill them soon, as the above three are already too hard for you, judging from your previous words. But I listed it there for your future reference.

Best wishes!
CrystilonZ
lol ok I'm starting to understand what the hell is going on. First of all let me rephrase some points being made here to avoid confusion. Please correct me if I'm wrong because this is getting crazy

Regraz wrote:

First, In formal Chinese writing, there is no logograms as well.
I believe what Regraz is trying to say here is about inputting Chinese characters using Latin keyboards. To type Hanzi characters you simply write the pinyin for them, and because there are a lot of characters with the same pronunciation, usually there will be a pop-up list like this for you to choose the characters from:

For clarification I did not type the spaces there myself. The computer separates syllables for you and I believe this is what HW mentioned.
The thing is this is just a way to input Hanzi characters into an electronic system. In the example above wo men de ai is never supposed to be the final result. The way you input stuff is not at all related to how you Romanise things.

I'd like to explain about Chinese characters being logographic as well.
You guys seem to have a little confusion about how characters are formed and how they function nowadays.
Some characters like 月 originally looked like the crescent moon. These are said to be characters with pictographic origin.
Some characters like 上 are created by trying to convey a concept, which in this case is up or above w/e. These are said to be characters with ideographic origin.
There are more ways that Hanzi characters are created but I'll not go there since they are not really related to the topic atm.
However, these are how characters are created not how they function. It's really really important to keep this in mind. As I can see this is where the misconception stems from.
Right now these characters' function is to represent words or phrases. Therefore Mandarin is, by definition, a logographic language.


The whole mess above is a result of piled-up misconceptions, I believe due to language barrier or whatever. From now on I'd like to request everyone involving in this discussion to refrain from using condescending tone, sarcasm, personal insults and anything that can impede the process. Don't take every fucking thing personally. Read these things with three pieces of chocolate chip cookie and a cup of tea.

Shad0w1and wrote:

So let's face the reality, there isn't a standard for Chinese romanization into ANSI code. I can't understand that without a commonly accepted standard, why would you guys try to change the current metadata rule?
We've expressed (thoroughly I believe) what problems the current system has. Please read all the previous points made in this discussion.
Mafumafu

CrystilonZ wrote:

The thing is this is just a way to input Hanzi characters into an electronic system. In the example above wo men de ai is never supposed to be the final result. The way you input stuff is not at all related to how you Romanise things.
This paragraph is quite illogical.

“The thing is this is just a way to input Hanzi characters into an electronic system” I have no idea on why you use a “just”, maybe you want to state your opinion that inputting Han Zi into an electronic system is not related to Romanization? This is completely wrong. Romanization of Mandarin is closely related to inputting with computers or other electronic systems.

“wo men de ai is never supposed to be the final result.” What final result do you mean? From the Romanization point of view, Wo Men De Ai or wo men de ai is already a final result! By stepping ahead you will have the Chinese characters you are going to input, which is, in osu!, the title/artist in Mandarin.

“The way you input stuff is not at all related to how you Romanise things.” Totally problematic, as mentioned above. It is closely related to Romanization from a contemporary perspective.

CrystilonZ wrote:

I'd like to explain about Chinese characters being logographic as well.
You guys seem to have a little confusion about how characters are formed and how they function nowadays.
Some characters like 月 originally looked like the crescent moon. These are said to be characters with pictographic origin.
Some characters like 上 are created by trying to convey a concept, which in this case is up or above w/e. These are said to be characters with ideographic origin.
There are more ways that Hanzi characters are created but I'll not go there since they are not really related to the topic atm.
However, these are how characters are created not how they function. It's really really important to keep this in mind. As I can see this is where the misconception stems from.
Right now these characters' function is to represent words or phrases. Therefore Mandarin is, by definition, a logographic language.
This paragraph is even more illogical than the one above. I think it is you who has quite a few confusions about Chinese/Mandarin and the characters.

You made an attempt to bifurcate the origin and function of a word (actually they are combined and in synergy now, however) by referencing some standard that is pretty irrelevant to what we discussed here. I think you missed (or omitted) the title of the ISO file you quoted, which is “Graphic technology -- Symbols for text proof correction”.

ISO wrote:

ISO 5776:2016 specifies symbols for use in copy preparation and proof correction in alphabetic languages and in logographic languages. It is applicable to texts submitted for correction, whatever their nature or presentation (manuscripts, typescripts, printer's proofs, etc.), and for marking up copy for all methods of composition.
See? This standard is specific for copy preparation and proof correction. The standard has to classify languages into types for the sake of copy preparation and proof correction since in this standard, copy preparation and proof correction of alphabetical and logographic language differs. Moreover, I believe copy preparation and proof correction is quite digressing and deviant from the topic here.

So I would like to borrow your sentence here:

CrystilonZ wrote:

It's really really important to keep this in mind.
Actually, more pertinent standards have already been provided by Hollow Wings above; however, you paid no attention to them when posting here while bringing up this standard. This is really not good manners. And it fails to support your statement.

I have some other comments:

CrystilonZ wrote:

From now on I'd like to request everyone involving in this discussion to refrain from using condescending tone, sarcasm, personal insults and anything that can impede the process. Don't take every fucking thing personally. Read these things with three pieces of chocolate chip cookie and a cup of tea.
It is really interesting to read this: “I'd like to request everyone involving in this discussion to refrain from using condescending tone, sarcasm, personal insults and anything that can impede the process.”

So who do you think is using condescending tone, sarcasm, personal insults and anything that can impede the atmosphere of this discussion? Wasn’t the discussion going on well until someone abruptly broke in and started to insult others? This sentence from you misleads people into thinking that many people are violating the code of conduct while, in fact, there is only one (maybe) who is doing obnoxious stuff.

CrystilonZ wrote:

Please read all the previous points made in this discussion.
This is exactly what I want to say to you, though actually I would like to say: Please read all the previous points made carefully in this discussion. Rest of your misconceptions have been already explained by Hollow Wings and Fycho so, if you choose to ignore them and force your idea, then I have nothing to do with that. People do not want to explain over and over again as that is pretty time-wasting.
Hollow Wings
pretty clear that almost all of wafu's replies are nonsense, just like what CrystilonZ did.

------------

Wafu wrote:

So, first of all, I'm quite surprised you are even trying to prove how Chinese is not logogram language. One says pictogram, one says ideogram, one says ideophonograph. I feel like you're trying to defend this so much that you have to take every single thing we said (even out of context, by the way) and simply make up something and say "do your research", while completely ignoring what we said.
i'm not defending anything, i'm picking up an international standard to show concepts that have been officially confirmed, not the everyday words you're using or just some instant wikipedia knowledge.

Wafu wrote:

You, as Chinese have no priority in this matter, just because it's about Chinese,
↑ and this is the most hilarious thing i've seen today.

i'm Chinese so i have no priority in this matter? you've gotta be kidding.
on the contrary, Chinese people have the exact highest priority in this matter, just because it's about Chinese.

i'll mark this so hard so that it can be a very useful joke to reply to everything you have posted and want to post.

Wafu wrote:

so stop acting like you are the better one and we know nothing because we didn't read document X, which nobody's even provided us before (and one of you in particular accusing others of not reading it, while misunderstanding it). Today's Chinese language is using only logograms. Origin of many of these logograms is pictographic or ideographic. You have to realise that something having pictographic/... features doesn't mean it's not logogram. I think you should know it, if you want to use it as an argument. What Hollow Wings said about this, by the way, is exactly proving that CrystilonZ was completely right on the fact that Chinese is logogram language, it's just that HW didn't understand the concept mentioned in the ISO file.
nonsense.
and how ignorant.
there're tons of Chinese characters still using their pictographic and ideographic features. i'm even too tired to give examples.

if you didn't read an international document that lots of official governments have recognized, then i'm better than you for sure.
because you are still ignorant about what the world-level common concepts are, and it is you who doesn't understand the concept mentioned in the ISO file.

besides, nobody provided us that document either; we found it ourselves, and tried hard to catch up with the international standard.
sadly you didn't do that, and you are just talking within your own area, below the international level.
if you still have no interest in reading and knowing some internationally recognized information, then you can keep your understanding from nowhere.
that leads to the fact that your understanding is just misunderstanding, and you refuse to correct it.

------------------------------------

Wafu wrote:

  1. You have to remember that ISO being international doesn't mean that we have to follow ISO (After all we would be breaking many of their standards, even in the other Romanisation systems). There are many references and standards that are not ISO and are better in quality and design. It's not like taking one thing that benefits you is going to help. Regardless, as I've read this standard (and if you are concerned that I may not have done some research, this is a tiny part of what I've read to make my opinions on this issue, the research is much deeper than a single ISO document), I think it benefits you less than you think it does. In fact, from what arguments you are quoting, you don't seem to understand it very well on your own (or maybe you just expressed yourself poorly, but you misrepresent the document you are promoting here).
nonsense.

do you even know how ISO7098:2015 was born?
it was done after thousands of language specialists' research, and went through a long time and three editions, referring to all the work related to Chinese romanisation to do their best to avoid trouble from it.
then eventually gave that ISO document as the result.

you are saying your little research is greater than theirs? and so the osu community can override that ISO standard?
another joke confirmed.

we've already shown what inconvenient troubles those things may cause, and those are still a small part of it.
but you haven't even cared about any of them, even though you act as if you are able to solve them with your research.


besides, other rules mentioned, like "The Basic Rules of the Chinese Phonetic Alphabet Orthography", are a national-level standard called "GB/T", recognized by the PRC government.
and there're a lot more documents like that, about Chinese romanisation.
i still don't wanna mention these because osu community is an international community, and i thought it's not appropriate to rule it with one country's standard, since it's just made for ourselves (and actually only used in elementary education and testing, as a tool to learn how to read Chinese characters), not for non-Chinese people.
then ISO shall be a better choice, obviously, at least better than GB/T.

you gonna throw away ISO the international standard and pick up some PRC standard?
Wafu: "You, as Chinese have no priority in this matter, just because it's about Chinese,"
hell, we Chinese people have even higher priority about standards made by our own country.

or you wanna create some osu rules aside of them, and even result in more and more troubles?
that will be surely a worse system than the current, not a better one.

------------

Wafu wrote:

  2. I already addressed this point in the previous post, but it was kinda taken out of the context. I already explained this in relation to Chinese script incompatibility. Anyway, what you're trying to say here is quite a non-sense. None of the listed languages are actually ideographic. All of these are logogram languages that partially use ideographic characters, but mostly characters that just originate from ideographic characters. The languages itself use logograms.
nonsense, and ignorant, again.

see what i've said about Chinese characters above, tired to repeat.

besides, do your research about Latin characters' pictographic property, like why you call eye "eye", to become a person knows Latin characters actually more than me.

------------

Wafu wrote:

  3. This has been addressed already. There's no reason why the missing reversibility factor would impair Latin script users from reading or memorizing this (that, however, is impaired by the current system which is reversible). It only has impact on people who actually can use Chinese language, these people do have a solution. The original title/artist is still present, so they don't need to reverse it to logograms. For Latin script users, there's no reason why they would want to reverse the text.

    This is the part that many of you have taken out of the context, and therefore misunderstood it. Again, Romanisation (internationally), is not designed for people who are fluent in Chinese, so there is no reason why they would want to convert it back to logograms. For those who are fluent in Chinese, you literally have it in the original title/artist.

    "Transliteration won't do, we need to do "Transcription"" argument doesn't make much sense. Not only, as I explained above, it's really not needed in this scenario, but at the same time, you are saying this and want to support a system that omits the phonetics? That literally makes it transliteration.

    "we chinese ourselves even cant understand what those words said in a short time, if they are all written in Latin characters of pinyin one by one" Don't make this up. If you can't understand Romanised text, it's because you can't process Latin script, again, there's the original title/artist for you to clarify it. You are not the primary target of the Romanisation. It's more important for a regular player to be able to memorize and read the title/artist, than for Chinese player to memorize, read and understand both the original and Romanised titles/artists.
the reason i show you it's not reversible is mainly telling you that Chinese characters are really special in romanisation area.

and as i have said, and i say this again here, is that:

THE TRANSCRIPTION STANDARD OF CHINESE-LATIN HAS NEVER EVER BEEN COMPLETED SO FAR.

do you have any idea how much trouble there has been along the way to find a method to convert Chinese, with its ideophonographic characters, into some system composed of alphabetic characters?
that work has lasted for over 70 years, and it's still in progress.
which means all of the so-called Chinese-Latin transcription methods are incomplete; that's the most important point i want all of you to notice.

and the best system from all of them, is the ISO.
why???
because Latin characters of pinyin for Chinese is made for us to show people who don't know Chinese characters' reading.
this is the original purpose of pinyin, it will never replace Chinese characters itself, because it can't do what Chinese characters do.
the romanisation of Chinese working is actually: Chinese character → pinyin.

AND, THAT WORKING STILL HAS NO STABLE STANDARD.

what you've posted like " Don't make this up. If you can't understand Romanised text, it's because you can't process Latin script, again, there's the original title/artist for you to clarify it." is just complete nonsense.

do you have any idea how difficult it is to transcribe Chinese characters into Latin characters, even if you wanna keep the words separated?

you may have no idea because you just missed what i've posted at such length.
well the deep reason for it is just that you barely know anything about Chinese.

and why on earth does it matter whether i understand romanised text or not? Latin script is just one of the tools to show how to pronounce Chinese characters, and nothing more.

what's more, none of tools can do that better because that's how complex Chinese character's pronunciation is like.

and that's why we prefer character-by-character transcription, because it causes the least trouble on every side of the romanisation work.

------------------------------------

Wafu wrote:

B
  1. Not worth talking about. This is a bit more out of topic and should already be clear from the previous discussion.
  2. The same way we're doing it for Japanese. And we are doing it for Japanese. The "Metadata Heap" Discord server is quite a big one and people do solve issues here quite quickly and effectively. Even for the languages they don't know very well. You really underestimate this community if you think it will be only and only up to staff members. Sure, they have to recheck, but if this is discussed in the channel, they generally have a good starting point. So far, I haven't seen a problem that wasn't resolved there, it really shouldn't be that big of a drama that you make it look like (And yes, even QATs/GMTs are active here, but they don't do majority of the requests on their own). This is, therefore, not an issue.
1. how is that out of topic when all Chinese language families' standard is Mandarin Chinese, which CrystilonZ really wanna focus on building the system for?
it is just like you are refusing to face it.

create all of those Chinese language's individual rules, or just follow Mandarin Chinese.
i don't see this is out of topic.

2. how ridiculous, that you compare Chinese to Japanese.

Japanese's kanji is part of Chinese culture, the main Japanese part is still structured by alphabetic characters: hiragana and katakana.
no trouble with most Japanese romanisation works when they are hiragana and katakana, just like other alphabetic languages.
only have trouble with kanji, which is really "Chinese" like.

However.

it's still a small part of Japanese, which means it can be recognized easily with those kanji's property.
then easily romanized into its proper Latin characters from its pronunciation, even kanji are written together:
there're still rare conditions need to be discussed.

that will not work for Chinese characters, because Chinese are structured with only hanzi.

"落下"

↑ simple Chinese word here. until i tell you what is said before or after this word, or the sentence contain it, or even the whole article,
you won't know how to read this word.
and that means you won't know what's its correct pinyin.
and that means you can't romanize it.

this is just a simple word example, and the same thing will happen widely in sentences. i'll just stop here because there are enough examples above.

even if you have a great group of people willing to do it, i will always doubt that you can actually do it correctly, because there's no standard for it and even Chinese people may not know the correct answer.
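
A small sketch of why a character-by-character lookup cannot settle this word on its own. The `READINGS` table is a reduced, hand-made assumption (only two common toneless readings of 落 are listed), and `possible_romanisations` is a hypothetical helper, not part of any existing tool.

```python
from itertools import product

# Reduced, hand-made table (an assumption for illustration):
# 落 has more than one reading, 下 is listed with one.
READINGS = {"落": ["luo", "la"], "下": ["xia"]}


def possible_romanisations(text: str) -> list[str]:
    """Enumerate every combination of per-character readings, ignoring context."""
    per_char = [READINGS[ch] for ch in text]
    return [" ".join(s.capitalize() for s in combo) for combo in product(*per_char)]


print(possible_romanisations("落下"))
```

Without the surrounding sentence, the lookup returns more than one candidate, and choosing between them is the manual step described above.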

besides, google or wikipedia won't help to reduce the staff's pressure when you guys do the previous work and just leave recheck work to them.

if you always meet song title like: "达拉崩吧", "但愿人长久", "唐僧在女儿国抒怀并看着女儿国王的眼睛" or "如果下雨的时候你拖着行李箱子站在屋檐下面那么其实我没有足够的时间找一个好一点的理由抛弃家里面的狗坐上K667次列车到你在的地方找个商店买一把伞然后给我妹妹弹吉他因为她要参加比赛所以我回不去了我也不会给你说我泡面的碗还没洗好".

------------------------------------

Wafu wrote:

C
  1.1. Same as for the the last point in B. This is not at all an issue. The second part of it, I don't think people take metadata DQs so negatively. I don't remember a single time it happened that people stopped mapping songs of certain language due to complicated metadata, even since metadata became more strict.
  1.2. Already addressed, this is not what should be discussed. Current system doesn't solve this even though you may imply it does. It doesn't.
  1.3. The result is "keep the current state" because of the conservative stance. From what I've seen, Chinese people argued for this system poorly and detached from the community that it's primarily about. Now the target community argues for something else, that doesn't mean you just keep conservative stance because we are not Chinese. The argument can never be that "it was discussed by Chinese community" because it's not only about you. We also don't say: "You can't judge most of the things because you don't know how Latin script languages work.", so don't do that to us.
  2. Already explained in above paragraphs.
1.1 i think you just don't get it at all, for you don't understand that no standards is identified. this is addressed lots of time.

1.2 the current system is a better one than what you wanna bring. this is addressed lots of time, even with its reason.

1.3 and that doesn't mean you are the one to make it forward, because you don't have that ability. this is addressed lots of time.

and ofc i'll say and do this as much as i can: you can't judge most of things because you don't know how Chinese script language work. and because of that, you also don't know how romanisation of Chinese script language work. that's why you are still here argued with nonsense.

maybe i know less about Latin script than you, even i can type and compose English words myself.
and you know less about Chinese script than me, even you can hardly know any character of Chinese.

------------

Wafu wrote:

  1. Again, nothing is going to be "up to Chinese staff", they don't even communicate about metadata much. It doesn't need to be automatic. Japanese is also not automatic. Your assumption that people who are not Chinese can't do Chinese metadata is incorrect. There are people who do it (e.g. in the aforementioned Discord server for metadata), some do it more reliably than Chinese people. There are many people who look up Japanese characters one by one and they are very accurate with it, even though you have to think of exceptions (there are even Chinese alternatives that are more accurate, btw.). This is not a problem again, unless you assume that where you are born determines what metadata you can do, which is not true.
again, nonsense with ignorant, that you compare Chinese to Japanese. won't text more addressed thing here.

Wafu wrote:

  2. I don't understand why you would enforce some "preconditions" in your post. I understand you want people to do some research, which is okay, but remember it works vice-versa.
because i think we shall talk about things at the international stage, so the standard of international level document is the basic prediction of our topic and discussion.

but since you thought that you can even create something that overrides international standards and throw those things international groups confirmed and identified away, ofc you won't understand anything of it.

Wafu wrote:

  3. Ancient Chinese was not an ideogram language. You've also proven CrystilonZ right, you just used different terms.
how ignorant that you even don't understand what Ancient Chinese is.

and Ancient Chinese do is an ideogram language.

------------------------------------

Wafu wrote:

To your point about us being "wrong". You don't know what Romanisation is about. The idea that Chinese Romanisation's goal is not at all to help non-Chinese people is non-sense. We are not in China where you don't care. We are in osu!, which is an international game. Concept of Romanisation in China is different than Romanisation worldwide (target = people using Latin script, this is what we need to use, otherwise we wouldn't have Romanisation at all). Consider everyone equal rather than saying that someone doesn't understand how hard it is because they are not Chinese. 1. Yes, CrystilonZ could Romanise Chinese. 2. That doesn't say much, but no, you don't need to have extreme experience in Chinese to find correct artist name and song title. 3. I don't know where your information about staff's real life and education stems from. osu! is not a full-time job, so these people can very well be experienced in languages. And no, we don't need to pay language specialists, we never did pay anyone and even complicated metadata is being actively produced.
nonsense with ignorance.

go and learn history about how pinyin was born. and you'll find that it's not at all to help non-Chinese people is non-sense is wrong.

i said so called pinyin you can see today is almost romanisation of Chinese, but that's not the purpose it was made.
the pinyin is just made for EVERYONE who doesn't know how to read Chinese characters, including Chinese people themselves.
there are lots of other forms of alphabetic language AND ideophonographic language methods that can do that.

romanisation of Chinese is just one of them, and it's not the thing that affects the conversion much.
it's the mode of conversion, which can't be easily decided, that affects it the most.

back to the romanisation of Chinese to non-Chinese speakers: they are all the same as Chinese babies who don't know how to read Chinese characters. in that case, romanisation is never a project to serve non-Chinese speakers, but all people who don't know how to read Chinese characters.
so if you require some "Romanisation worldwide" then sadly as i've already addressed, no standard has been identified.

and from the last part i can see that you still don't get it, so i address this again: there is no standard of it.

with low efficiency, not international, and result in wrong ends.

Wafu wrote:

Conclusion:

I think I addressed all the relevant points. I want to make it clear that if some parts sound harsh, that's not what I intended. I just want this discussion to be fair for everyone. (and without Regraz's attempts to make fun of someone)

I hope this makes some clearer. The biggest issue I see you guys didn't understand is how Romanisation works internationally, outside of China. That's probably what makes you all not realise that being able to convert Romanisation back to Chinese is not the primary intention (especially not in osu!, where the original Chinese titles will be visible)
the biggest issue here now is:

1. you think you can override international standards, before you even starting talking about romanisation works internationally.

2. striked with thoughts like "You, as Chinese have no priority in this matter, just because it's about Chinese," which just make other people like "what the hell is this guy even talking about".

clear your mind before you make any further nonsense like things above, i don't see what direction all these mess would leads.
Wafu

Regraz wrote:

Regardless of the totally illogic post Wafu made above, I find it interesting and ironic that Wafu’s post here is in fact contradicting what Wafu sent me in forum private message.

I even could not help laughing when Wafu said:

Wafu wrote:

…and without Regraz's attempts to make fun of someone
Maybe Wafu did a provocative (but bad?) try to labelize, defame, calumniate and libel others? However, from his PM to me, it seems Wafu himself even failed to keep his words civilized. I will attach the screenshot of that forum pm here for everyone to read.



Who do you think is actually making fun of others?
Your false accusations (of us not reading stuff or not being professional) did, indeed, make me send you this message (and it is called exactly that: "Private message"). I'm not making fun of you as it was not public, you making it public doesn't mean I'm making fun of you. I wanted you to know that putting this down to "there's no research" was unfair of you, as you didn't invest your time into the research either. Was I being rude to you in the private message? Yes, as you were when you clearly and intentionally ridiculed the proposal, except I at least could keep it private.

I did not label anyone. If there is a single case in my post where I did so, I'm willing to apologize to that person. I didn't use any racial slur (I used epithet as an example that is actually relevant to the discussion, didn't aim it at anyone, that's pretty clear in my post), I didn't use "Chinese" with relation to any stereotype or in an insulting manner (if you are concerned about the word "conservative", I stated the definition in the beginning, this is not related to politics, it's just "I don't want change" stance). I did not defame/calumniate/libel others, I did give counter-arguments to all the points that seem to be invalid based on the reasons I've given (majority of them were given before, but I had to make this post essentially again, because the discussion in the proposal was mostly ignored). Yes, I did say "you should know this" or something along the lines. That is because if you have some requirements for us (e.g. reading ISO documents), you should at least know things that you are required to know to understand the document in the first place.

I did not use profanity in that post when elaborating your reply. In fact, I didn't use profanity in that post at all. Proof. I was fair to everyone publicly, but mentioned that I did not like your attitude towards us when publicly humiliating us and suggested that people would refrain from it in this discussion. Rewriting a table (that was used to compare and summarize points that were elaborated before) to something that exists for no other reason than to humiliate someone, is not what anyone expects in a discussion where people are trying to give arguments.

Regraz wrote:

Sadly, arrogantly defining Chinese as a logographic language is a pure fallacy. In formal Chinese writing today, Chinese characters are input by keyboards in a syllabic way, and it is the dominant method of inputting Chinese (and its Romanization as well) in osu! amongst players. Under the mixed impact of other languages and the development of currently dominant input method of Chinese in Internet, especially when it comes to romanization, "you called" logographic characteristics of Chinese is increasingly ambiguous. Your statement is already groundless and archaic because you still stay with writing Chinese characters in paper, instead of considering the input method with keyboards, which is syllable-oriented and in fact supports the current syllable-based metadata (Romanization) scheme.
Why am I arrogant for saying that Chinese is a logographic language? What you are saying is a fallacy, because you are changing the topic to the relation of language to keyboards. Yes, logograms can be typed as syllables on a keyboard. That, however, doesn't change the class of the characters; it is done this way because you can't fit all the characters on one keyboard. A language is logographic if it primarily uses logograms. Hanzi, by definition, are logograms, and that is what makes Chinese a logographic language. Even though you write Chinese differently on the keyboard, that changes neither the definition of a logogram nor the fact that the resulting characters are logograms. I couldn't find an example of a Chinese character that is not a logogram, and I don't think that several standards, including ISO, would get it wrong.

Regraz wrote:

Learn the basics about metadata and Romanization. I do recommend knowing the basics about Mandarin as well since Romanization is a work requires knowledge about both ends, though I do not expect you to do this much because it seems you have no idea about what is Romanization.
I said what it is about and what the intention is (even in the proposal posts). You just ignored the reasoning completely.
-Atri-

Wafu wrote:

You, as Chinese have no priority in this matter, just because it's about Chinese,

Let me reword that:
"You, as Westerners have no priority in this matter, just because it's about Chinese"
Sieg
Summary on Russian \ Cyrillic Romanisation:

The current wording was agreed to need improvement in the previous discussion 10 months ago.

draft wrote:

Cyrillic Romanisation: Use BGN/PCGN system for Russian/Cyrillic. Е and е should be romanised as ye if it stands alone or after a, e, ё, и, о, у, ы, э, ю, я, й, ъ, ь. In other cases, it should be romanised as e. ё should be romanised to ye, however, use yo or o to avoid usage of special characters. Ignore any other rules in the file provided, these are either irrelevant or wouldn't help in the game. If an artist uses a preferred romanisation, follow it regardless of this rule. For most of the other characters, refer to the first page of this document.

draft wrote:

Songs with Russian metadata must be romanised using the Cyrillic Romanisation method in romanised fields when there is no romanisation or translation information listed by a reputable source. The same applies to the Source field if a romanised Source is preferred by the mapper.

The flaw in the current wording is that it attempts to generalise all Cyrillic-based languages with replacement rules that were discussed and agreed only for the Russian language.
While we suggest using BGN/PCGN, there is an ASCII limitation in osu! for romanisation fields. So replacement rules for "special" characters from other standards were discussed and agreed on, e.g. from ISO 9:1995 for "ё" - "yo" (in BGN/PCGN it is stated as "ё" - "ë" or "yë"). Exceptions were also made for some phonetic sequences.
There may be cases where extending rules that work for Russian to other languages won't give acceptable results. The same goes for simply mentioning BGN/PCGN for other Cyrillic-based languages because, as stated, we have ASCII limitations and other unavoidable exceptions that are not covered.

proposal for new wording (changes are highlighted)
Russian Romanisation: Use BGN/PCGN system. Е and е should be romanised as ye if it stands alone or after a, e, ё, и, о, у, ы, э, ю, я, й, ъ, ь. In other cases, it should be romanised as e. ё should be romanised to ye, however, use yo or o to avoid usage of special characters. Ignore any other rules in the file provided, these are either irrelevant or wouldn't help in the game. If an artist uses a preferred romanisation, follow it regardless of this rule. For most of the other characters, refer to the first page of this document.
Songs with Russian metadata must be romanised using the Russian Romanisation method in romanised fields when there is no romanisation or translation information listed by a reputable source. The same applies to the Source field if a romanised Source is preferred by the mapper.
As for other Cyrillic-based languages, I propose to leave them to a case-by-case scenario because the current number of such sets is negligible.
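
As a rough illustration of the "е" rule in the wording above, here is a minimal Python sketch. The names `romanise_russian`, `YE_TRIGGERS` and `TABLE` are made up for this example. The table is my reading of the basic BGN/PCGN letter values, treating a word-initial "е" as "standing alone" is an assumption, and so is dropping ъ and ь from the output. It uses the "yo" fallback for "ё".

```python
# Letters after which "е" becomes "ye", as listed in the draft wording.
YE_TRIGGERS = set("аеёиоуыэюяйъь")

# Basic BGN/PCGN letter values in lowercase. "ё" uses the ASCII fallback
# "yo" from the draft. Dropping the hard and soft signs is an assumption
# of this sketch, not something the draft states.
TABLE = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l",
    "м": "m", "н": "n", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "у": "u", "ф": "f", "х": "kh", "ц": "ts", "ч": "ch",
    "ш": "sh", "щ": "shch", "ъ": "", "ы": "y", "ь": "", "э": "e",
    "ю": "yu", "я": "ya",
}


def romanise_russian(text):
    """Romanise Russian text, applying the ye/e rule for the letter "е"."""
    out = []
    prev = ""  # previous Cyrillic letter in lowercase, "" at a word start
    for ch in text:
        low = ch.lower()
        if low == "е":
            # "ye" at a word start or after a trigger letter, else "e".
            rom = "ye" if prev == "" or prev in YE_TRIGGERS else "e"
        elif low in TABLE:
            rom = TABLE[low]
        else:
            rom = ch  # spaces, digits, Latin letters, punctuation
        if ch != low and rom:
            rom = rom[0].upper() + rom[1:]  # keep the original capital
        out.append(rom)
        prev = low if (low == "е" or low in TABLE) else ""
    return "".join(out)


if __name__ == "__main__":
    for word in ["Ельцин", "лето", "поезд", "Ёлка"]:
        print(word, "->", romanise_russian(word))
```

An artist's preferred romanisation would still override whatever a helper like this produces, as the wording says.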


Kurai wrote:

- Cyrillic Romanisation should follow the BGN/PCGN system (except for the letter ё in Russian which should follow the GOST 2002(B) system). Read more here: http://up.kuraip.net/032209ex3724.pdf
Well.. we can separate Ukrainian and discuss the details, but due to the extremely low number of beatmaps and people involved I don't see this as productive work.
Wafu
@Sieg: Yes, I did split it previously, but somehow it didn't end up edited in the proposal. Not sure where the error was, I maybe didn't remind them to add it. I will mention this. Your proposed wording seems about right.

Answering to Hollow Wings by the ----- paragraphs:

  1. First of all, you did use the ISO document as your argument, but you didn't even know that the citation about "ideophonograph" language was just confirming what CrystilonZ posted. You agree with ISO on the same thing that you disagree with CrystilonZ on. They state the same thing. Second of all, stop taking what I said out of context again. I didn't say Chinese have no say in this matter, I said they are not the primary target, Chinese don't need Romanisation to be able to read, memorize and search the Chinese title/artist. It's Latin script users who need it. I did explicitly explain what I mean by this, you call me ignorant in multiple parts of your post, but at the same time, you ignored what I said about this part. This makes complete sense, if you don't just cut it like this.

    "i'll mark this so hard so that it can be a very useful joke to reply everything you have post and want to post."
    Not sure if you are even serious with this part, but either way, it's not a very useful joke. You are encouraging people to use it against us, instead of using arguments. Please refrain from that. If you want a serious discussion, stop humiliating people and taking stuff out of context.

    Yes, I agree with that point. Some Chinese characters indeed do use "pictographic and ideographic features". You even quoted me saying that. That doesn't make the language pictographic or ideographic, because even the characters with pictographic or ideographic features are logograms. That makes the language logographic. Why do you call something non-sense and then say the same thing?

    I didn't say I did not read the document. In fact I did read it before the proposal was even submitted. There is, however, more stuff important to read than just one standard. This was in relation to your false accusations—we were accused of not reading that and were treated as uninformed/unintelligent/unskilled/whatever you wanna call it. Even if I didn't read it, you wouldn't suddenly have the right to override whatever I said. We could do the same thing, tell you that you didn't read what we did, and ignore you from the discussion, we have never done anything like that. We are discussing with you no matter how much research you have.

    I'm not refusing to correct anything. I won't just correct something that we have a pretty sensible argument for.
  2. Not sure why you call it non-sense again. How is it relevant whether I know how a document was made? I never said it is bad, I said it is not the only reference we should consider and that it is not the best for osu!. Already gave reasons for that.

    I never said my "little research" is better than ISO's, so I don't understand this accusation again. osu! community can ignore ISO standard because we are not obligated to use ISO standards. We are breaking many standards in osu! (including all the Romanisation systems we use right now (except for Korean), even the current Chinese Romanisation breaks its standards). If we see the benefit of breaking a standard, we can do that. Because you promote ISO standards in osu!, I have a question. How would you deal with the issue that ISO actually has, which is that their research takes such a long time that by the time of the publication of the documents, the data is outdated and sometimes limiting (some going even 10 years outdated, despite being released in 2015)? That's why I mentioned that having a wider knowledge rather than relying on one standard is important. It limits us. And by this, I'm not saying ISO is bad, I'm saying there are issues with every standard, sticking to one that would cause us issues wouldn't be the best choice.

    Higher priority about standards made by your own country? It doesn't belong to your country. I already explained that you taking this out of context doesn't help anything. I explained why Chinese don't have the highest priority here very clearly.
  3. Explained it in point 1. And several posts before. It's not non-sense, and it's not ignorant. You are the one who took what I said out of context.

    Why would I do my research on Latin script's pictographic property? Why are you even commanding me to do a research, again? And why do you think you know Latin script better than I do? I don't understand where this is coming from and why'd you even have to make this comment.
  4. The first part, we've been over this. I even addressed this and again, this is true, but it doesn't make ISO the best for our needs.

    Second part, more accusations? As I already said, people can convert Chinese to Latin script this way (already mentioned which part of community can), you are intentionally making it look harder than it is. Chinese to this system is not hard, it's the other way around which is harder, but reversible way is not necessary because you have the original title here.

    In context of osu!, it's not only about pronouncing the Chinese characters. It's about reading, memorizing and pronouncing. All of these parts are crucial, you can't ignore them again, I did explain this thoroughly. Even if the opinion among the majority of the community was that only pronunciation is important, then the current system still has more pronunciation problems than the proposed one. Already explained that too.
  5. This is off-topic because this topic was already clearly explained, and why the case-by-case system would be used for it (as it is, even now). Why are you giving us only two options, when we explained a third option, which anyway would mostly tell you to use Mandarin Romanisation?

    I don't compare Chinese with Japanese. I compare the way we Romanise Japanese, because it has the same problem. I don't understand what's your problem here. I was obviously talking about Kanji, as that's the part where Romanisation is complicated. Again, there have been no issues.

    "besides, google or wikipedia won't help to reduce the staff's pressure when you guys do the previous work and just leave recheck work to them"
    What? Who talks about Wikipedia? That's literally even said to be an unreliable source for metadata. This system doesn't mean that they will have to check it from now on. They are already checking it, so it's not this system's problem. I said these people, who actually do check metadata actively (not exactly calling it staff), would give quite enough references, and from what I've seen, the majority of cases have been successful even with very complicated metadata. The pressure on staff will be minimal because of this, already explained this though.
  6. Could you please reword your first point? It doesn't seem related to what I said at all. You are telling me that I don't understand something when I was talking about an experience. (no, this is not meant to be provocative as some of you will think, this is not to mock HW's English, I genuinely can't understand the meaning of that sentence and want HW to reword it if they want me to understand it).

    The second point, I could say that too. It was only addressed internally within the Chinese community. The target group are Latin script users, don't just end it with "it was discussed many times in past".

    Third point, not sure why you are personally attacking me. How do you know what my education is, what my job is, what my real life is? You don't know any single thing about my personal life, so don't act like you do. I could just tell you that you use VPN and accuse you of not being Chinese and can't participate in this discussion. I never did anything like that, so stop doing it.
  7. If you have a problem with me comparing how osu! works for 2 different Romanisations, I think there's a different problem. Stop calling me ignorant if you ignore what I've even written in that paragraph. It's also not non-sense. I literally just say how people work. How can that be non-sense? That is an observation.

    Already talked about this. I also don't ignore you because you didn't read what we did.

    Oh yeah, call me ignorant again. Ancient Chinese was, again, a logographic language because it used logograms, not ideograms. Yes, the logograms did have ideographic features. Again, that doesn't make the language ideographic.
  8. As for the rest, I don't think I have to respond to that. I already explained that this is how Romanisation should work in this game, I did repeat this in 3 posts or more already, so I don't think copying it is required.
As for the end "you think you can override international standards, before you even starting talking about romanisation works internationally", it was explicitly explained multiple times.

"striked with thoughts like "You, as Chinese have no priority in this matter, just because it's about Chinese," which just make other people like "what the hell is this guy even talking about"", sure, it's going to sound bad if you take it out of the context like this.

@Firis Mistlud: This thread is for discussion about the proposal. Not about trolling and taking this out of the context.
VINXIS
what are u on wafu and why do u expect people to follow your Almost Character limit hitting posts can u AT LEAST stop vomiting a bunch of dictionary words

you are talking to an international community and you expect people who speak English as a second/third/fourth hand language to follow ur posts saturated with random noise Ok
CrystilonZ
To be honest I'm very pissed off right now and you have no idea how hard it is for me to post in this calm manner.

First off

Firis Mistlud wrote:

Wafu wrote:

You, as Chinese have no priority in this matter, just because it's about Chinese, <--- True

Let me reword that:
"You, as Westerners have no priority in this matter, just because it's about Chinese" <--- Also true

You, as Chinese have no priority in this matter, just because it's about Chinese. I believe all people here are civilised people and civilised people argue with reason. Read more about this here <Argument from authority>

Hollow Wings wrote:

on the contrary, Chinese people have the exact highest priority in this matter, just because it's about Chinese.
NONSENSE

Secondly

Hollow Wings wrote:

CrystilonZ wrote:

This is not exactly true. If it were Mandarin would have been dead a long while ago because the only way to communicate would be carrying a crap ton of paper with you at all time and write stuff when you want to communicate.
In English context it would be equivalent to you guys seeing or hearing /tīm/ (IPA stuff. This reads time). Intuitively the first thing that come into your heads would be the time. Tick-tock clocky stuff. However under different contexts:
"Can you buy me some /tīm/. I'm going to use it to cook dinner." In this case /tīm/ is the herb thyme.
"I don't have enough /tīm/ to do my homework. It's due tomorrow." In this case it's "time"
"Two /tīm/ two equal four." In this context it means multiply. 10/10 grammar.
As you can see they are reversible with context. And when you guys speak to each other you're actively tracing back to the original Hanzi characters using their pronunciation. Therefore, saying that it is not reversible is not true. It's harder in Mandarin (410 syllables - crap tons of words. Do the maths) but the fact that there are people speaking Mandarin proves the fact that it's possible.
lol

NONSENSE.

i think you still don't have enough cognition about how Chinese words and sentences can become.

again, you CAN'T simply know what those Chinese character exactly is, until you need to fully understand all of components both in and out of it.
if you just get the sentence without any other notice, you will never be able to do that, which means that sentence's meaning is various.

here are some examples:
a. one best example here, which shows that if you make mistake with it, you may got big trouble.
"Gu Niang, Shui Jiao Yi Wan Duo Shao Qian?"
this sentence mainly has two meanings:
1. "Hey girl, how much it costs if i buy a bowl of your dumplings?" (姑娘,水饺一碗多少钱?)
2. "Hey girl, how much it costs if i sleep you one night?" (姑娘,睡觉一晚多少钱?)
this widely happens in electric alphabet systems without tones, just like osu system.

b. a more common one here.
"Jie Dao Shou Zhang Zai He Shang"
this is too complex, i just do some transcription, and you may just do your mathematics mapping and see if you can figure out all of that sentence may means:
1. "Jie Dao" → 接到(catch/catch up/get/take/etc.), 街道(street/road/way/etc.), etc.
2. "Dao Shou" → 到手(already get sth./reach your hands/etc.), 倒手(transfer things between hands/buy in&out/left hand/etc.), etc.
3. "Shou Zhang" → 手掌(palm/people you trust/etc.), 首长(boss/highest level person/etc.), 收账(charge/blackmail/etc.), etc
...
oh hell, i won't continue.

this also happens even you have words separated:
"Jiedao Shouzhang Zai Heshang"
↑ maybe try your best to figure out what this means, and i can predict that you may find out at least 4 of meanings.

that's why i'm always saying why it's complex:
alphabetic characters can be transliterated immediately, even if you don't know what that word means.
and this won't work to ideophonographic characters, especially Chinese characters.

and that's also the detail part of why it's not reversible.

c. some special meme here.
"Shi Shi Shi Shi Shi"
non-Chinese speakers may have no idea what's this.
but it's a popular article called "施氏食狮史" which is a best example to show how hard it may effect us to just read Chinese with only pinyin (or romanized Latin characters).

if you insist your opinion then try to figure out what this sentence means:
"Ji Ji Ji Ji Ji"
just mention: that's also a wonderful article in Chinese writing.

this is just one form of Chinese meme, there're tons of others in Modern Chinese.
like "爷爷", "不星", etc.
this is what general phenomenon in Chinese language environment and its romanisation like.

Latin-Chinese transcript is not reversible, is the exact truth.
This is for Regraz. Please allow me to demonstrate how Hollow Wings has been posting so far.

lol
DID YOU READ ANYTHING I POSTED AT ALL EXCEPT THE LAST LINE?
My opinion here partly AGREES WITH THE CHINESE SIDE and saying it's NONSENSE means you are CONTRADICTING YOURSELVES. This is what I said

CrystilonZ wrote:

As you can see they are reversible with context. And when you guys speak to each other you're actively tracing back to the original Hanzi characters using their pronunciation. Therefore, saying that it is not reversible is not true. It's harder in Mandarin (410 syllables - crap tons of words. Do the maths) but the fact that there are people speaking Mandarin proves the fact that it's possible.
Maybe people like Hollow Wings are not good at speaking English. Let me simplify this for you.
Phrases in pinyin are reversible with CONTEXT, but it is indeed quite hard (or impossible if the prerequisites aren't met) compared to other languages because in Mandarin there are a lot of HOMOPHONES.
Every single example you provide either does not have enough context or is ambiguous because of HOMOPHONES. EXACTLY LIKE I SAID
Furthermore you even mentioned the input method yourselves. What you input is pinyin and if IT'S IN ALL CASES IRREVERSIBLE HOW EXACTLY DO COMPUTERS CHANGE THOSE INTO HANZI CHARACTERS?
Read more about this fallacy here <Faulty Generalisation>
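
The homophone point can be made concrete with a small Python sketch. It uses only the word pairs already quoted in this thread, written as toneless pinyin. `FORWARD`, `invert` and `candidates` are hypothetical names for this example, and a real input method would rank candidates using far more context than a lookup table like this.

```python
from collections import defaultdict

# Word -> toneless pinyin, taken only from the examples quoted in this thread.
FORWARD = {
    "接到": "jie dao", "街道": "jie dao",
    "到手": "dao shou", "倒手": "dao shou",
    "手掌": "shou zhang", "首长": "shou zhang", "收账": "shou zhang",
    "水饺": "shui jiao", "睡觉": "shui jiao",
}


def invert(forward):
    """Group words by their toneless pinyin (the reverse direction)."""
    reverse = defaultdict(list)
    for word, pinyin in forward.items():
        reverse[pinyin].append(word)
    return dict(reverse)


REVERSE = invert(FORWARD)


def candidates(pinyin):
    """Return every word in the table that shares this toneless pinyin."""
    return REVERSE.get(pinyin.lower(), [])


if __name__ == "__main__":
    for key in ["Shui Jiao", "Shou Zhang"]:
        print(key, "->", candidates(key))
```

Each word in the table maps to one romanisation, while going back gives several candidates, so the reverse step needs context to pick one of them.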

I'm going to stop here. Can you see that the text above is really condescending and provoking?
There are a bunch of misconceptions here because of bad interpretation or bad arguments plagued with fallacies, and unfortunately I don't have enough time to go through all of them.
abraker

Tofu1222 wrote:

abraker wrote:

Any thoughts about mapping style or patterns the maps have being in tags?
Don't you see that you are in the wrong topic.
Metadata covers tag guideline/rules. How am I in the wrong topic?
Sieg

abraker wrote:

Any thoughts about mapping style or patterns the maps have being in tags?
I don't see any restrictions for this right now as long as they are related to the set. Also don't think that this worth specific mentioning.
VINXIS
the discussion of this topic shouldve ended like a few posts after fychos

it makes absolutely no sense that the chinese do not have the higher priority when talking about chinese.. it is Quite Literally the language that they speak AND they are also... Quite Literally... the most Affected by the proposal regarding chinese metadata.. not sure how the priority of a group of people is parallel to a nonreasonable discussion either..

why has the discussion even devolved to the point where we are talking about the method of speaking to one another in chinese when this is in fact about chinese metadata which is mostly targeting the track's title and artist

i think what fycho said makes sense in that chinese metadata should be separated by syllables since it is the standard of romanization used in many other places evidently and it's easier for chinese people to understand the titling + it really doesnt make it any harder to read the title/artist with separated syllables so i dont see the harm in staying consistent with other platforms alongside making it easier for chinese people to..... read their own language L
Wafu

Mishima Yurara wrote:

it makes absolutely no sense that the chinese do not have the higher priority when talking about chinese.. it is Quite Literally the language that they speak AND they are also... Quite Literally... the most Affected by the proposal regarding chinese metadata.. not sure how the priority of a group of people is parallel to a nonreasonable discussion either..
To your previous post, if we discuss language, we will use terms related to languages and linguistics. I can't avoid that.

Can you elaborate on how people who are able to read Chinese are affected more than people who use Latin script? This is the difference for them: Current system, system in the proposal. In what scenario would Chinese read the Latin title and convert it to Chinese, if it's in the game already? That's why Chinese isn't the highest priority. They are actually affected the least of all players by that, because they don't need to read the Romanised title/artist.

Mishima Yurara wrote:

it really doesnt make it any harder to read the title/artist with separated syllables so i dont see the harm in staying consistent with other platforms alongside making it easier for chinese people to..... read their own language L
Where's the basis for that? It does make it harder for the reasons mentioned already. In both the proposal and several of these posts. How do Chinese people read it easier, if the text doesn't change for them at all?
VINXIS
ive only seen Romanized Chinese separated by syllables everywhere and not by phrases or any other way
Nyquill
uh.

at any rate, can we agree to clarify what word-by-word means in the proposal and give examples for what to do and what not to do? I did a quick google for the phrase "word by word" and couldn't figure out what it means so...
Fycho
@Nyquill, it means each character must be romanised into a single, capitalised, separated word. Refer to this thread for examples and supplementary information.
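
A minimal Python sketch of the two conventions being compared, assuming the syllables are already available as toneless pinyin. The function names `romanise_char_by_char` and `romanise_grouped` are made up for illustration and are not part of any osu! tooling, and the word grouping passed to the second function is supplied by hand:

```python
def romanise_char_by_char(syllables):
    """Current rule: one capitalised, space-separated word per character."""
    return " ".join(s.capitalize() for s in syllables)


def romanise_grouped(words):
    """Alternative under discussion: syllables of one word are joined.

    `words` is a list of words, each word being a list of syllables.
    Deciding where the word boundaries are is the hard part and is
    NOT solved here; the caller has to provide the grouping.
    """
    return " ".join("".join(word).capitalize() for word in words)


print(romanise_char_by_char(["tu", "shu", "guan"]))  # Tu Shu Guan
print(romanise_grouped([["tu", "shu", "guan"]]))     # Tushuguan
```

The first function needs no linguistic knowledge at all, which is the consistency argument for the current rule; the second one depends entirely on who decides the grouping.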

Also, let's keep the discussion in a healthy direction and stop any personal attacks, publicly or privately. Anything that doesn't help the discussion will be removed from now on.
VINXIS
can we not have the Mapper decide if the source should be romanized or not or have 2 sections for source (romanized and original) because not giving that to the hands of the mapper would be more consistent

(id personally say to just keep things UnRomanized but thats me)

i get its unicode and it can hold anything but i think thats on the basis moreso that some sources are officially in english and some sources are officially in maldivian and not really because of our choice of language transliteration
Topic Starter
Okoratu
what the fuck is this thread now????
Spit at each other elsewhere holy fuc like i have read Wafu's posts like twice by now and i still have no clue what he's saying because i dont get half the words and im not even half bad at english lol, idk how you want to argue about anything if half a post is about semantics in statements of people that don't speak as accurate english as you or whatever?

anyways this is lol so here's my thoughts on some of the points:
@romanized source: i think that was requested years ago but nothing happened on that front yet so lol
@nyquill: word by word method should then just be char by char or whatever where each character is romanised individually
@Wafu your argument is retarded because it implies they dont use beatmap search to find maps on the website which dominantly lists romanised fields only.
so in any case people that know the language are put on the same level playing field as anyone else.
@Sieg thx for summary i'll do the changes~

Can you people be less dumb when debating about this? like really this hurts to read because it's just so stupid. You arguing about whether or not people have priority or whatever just seems fucking racist in both ways so please stop. People that speak the language and have an intuitive understanding of it should be able to understand what a title means by reading how it's pronounced and everyone else is supposed to understand how it's pronounced

as far as i can see languages relying on phonetics always have that problem where short things can stand for many other things and there's nothing really to do about this, anything using an alternate alphabet to latin script maybe does so for a reason
Nevo

Shiguma wrote:

I believe that TV size cuts of songs should have the (TV Size) label on them, regardless of the official source. My reasoning for this is, when you search up a song, having the (TV Size) in the metadata won't affect searching for that song, while also making it very clear that whichever set you are looking at is the short version of a song. If we're bringing common sense into metadata, I don't see why we shouldn't do this.
Well, I can understand the logic of this; however, I feel we should stick to the official metadata because, well, it's the official metadata. Seeing if a map is the TV size/short version shouldn't be too hard for the majority of people, since things like ~Anime-Ban~ , TV edit. , (short ver.) should make it pretty obvious it's the short version of the song. I don't think we should add things like (TV Size) to songs from shows that don't officially differentiate the short version from the full version as it's not official.
melloe
First, some insignificant and unstructured observations, thoughts: for organization and perspective. Important-er stuff later.

Firstly, regarding priority, it should be said that westerners/non-Chinese speakers should ostensibly enjoy priority when it comes to this decision. Romanization is for their benefit, because they're unable to read Chinese. But it really depends on how reliant Chinese speakers are on romanization, because I don't know. What settings do most of them use on osu? How do they navigate on the website? I really don't know, feel free to provide enlightenment on this subject.

Secondly, as an English speaker (being ethnically Chinese, I learned Chinese when younger, and have since forgotten it, but I still retain a basic grammatical and conceptual foundation of the language), it is much easier for me to remember romanized titles if different syllables are grouped together into words. Although Chinese and English are both polysyllabic, English is the only VISUALLY polysyllabic language, as we group syllables together into words and, most importantly, separate those words using spaces. Chinese, to my knowledge, generally does not. Having English as a first language has geared my brain towards taking into account spatial grouping when processing language, so I take each isolated group of letters as its own discrete entity and allocate it its own semantic (or, in the absence of fluency in Chinese, quasi-semantic) space and recognize it as such. If what I've said is a little obscurely phrased, then please just take it as testimony from an English speaker that Tushuguan is unequivocally easier to memorize than Tu Shu Guan, and I don't think my threadbare knowledge of Chinese contributes to that at all. Faced with a title such as "Gei Wo Yi Ge Li You Wang Ji" I would quickly become discouraged and not even try to memorize it, except maybe after numerous plays. I'd sooner type in the mapper's name and click through the options presented to me.

Wafu wrote:

2. As for the memory point, again, you are considering this point from the Chinese speaker perspective. That's not the target group. As above, it's about how Latin script works with words. As you probably know, when people who use Latin script read longer words, they generally don't read them, they just recognize it by the shape of the word. Because of that, they will also miss minor spelling errors, because they read the originally intended word by the shape. That suggests (which is a fact by the way) that they memorize text (that is seemingly a word) much easier than syllables. As an example, you probably have the shape of "Romanisation" memorized pretty well. That means if I'd misspell it to "Ronamisation", you would quite likely not notice that. Whereas if I did "Ro Na Mi Sa Ti On", you would more likely notice the error, because you would read it syllable by syllable.
Thirdly, to address the problems of grouping together romanized Chinese syllables into words. It is true that in grouping together syllables there is a lot of ambiguity, but much of that ambiguity should be able to be cleared in context. For instance, taking this charming example provided to us:

Hollow Wings wrote:

"Gu Niang, Shui Jiao Yi Wan Duo Shao Qian?"
this sentence mainly has two meanings:
1. "Hey gril, how much it costs if i buy a bowl of your dumplings?" (姑娘,水饺一碗多少钱?)
2. "Hey gril, how much it costs if i sleep you one night?" (姑娘,睡觉一晚多少钱?)
Context should be able to very easily clear up such ambiguities. What is the song about? What is the rest of the song saying? Context will provide an almost effortless resolution to such conclusions, which I imagine would comprise the vast majority of such instances.
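
To make the ambiguity concrete, here is a minimal Python sketch. The `CANDIDATES` table is hypothetical and holds only the readings from the example quoted above; a real input method works from a far larger dictionary and ranks candidates by context. Note that 水饺 (shuǐjiǎo) and 睡觉 (shuìjiào) differ in tone, which the toneless romanisation used in metadata drops:

```python
# Hypothetical lookup built only from the quoted example.
CANDIDATES = {
    ("shui", "jiao"): ["水饺", "睡觉"],  # dumplings / to sleep
    ("yi", "wan"): ["一碗", "一晚"],      # one bowl / one night
}


def candidates(toneless_syllables):
    """Return every known hanzi reading for a toneless pinyin sequence."""
    key = tuple(s.lower() for s in toneless_syllables)
    return CANDIDATES.get(key, [])


print(candidates(["Shui", "Jiao"]))
```

With more than one candidate per key, something outside the romanised text itself (lyrics, the original title, the listener's guess) has to pick the reading.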
However, some of those ambiguities will be purposely rendered in the form of puns etc., such as here:

Fycho wrote:

For example, specific examples like "他谁都打不过", it's used intentionally to represent two meanings that are "Nobody can beat him" and "He can beat everybody", "Ta / Shui / Dou Da Bu Guo" and "Ta / Shui Dou Da Bu Guo".
These will most likely make up such a negligible percentage of these instances of ambiguity that to go through with the proposed changes and deal with these intentionally ambiguous titles as they come up would not be completely remiss -- but I personally believe that even these hypothetical cases, however rare, should be considered before pushing any changes. That is just my opinion, ultimately it's not up to me.

Fourthly, about "v" vs "u." To Chinese speakers of course "v" makes the most sense, as that is the input they use in their everyday lives, but to the western audience, "v" will make absolutely no sense. "u" and "yu" are both inadequate romanizations of "ü," because "yu" will be pronounced "yoo" by most westerners, but "v" will be next to useless for everybody except for Chinese players. "v" is more ambitious in that it serves to correctly represent a specific sound instead of simply approximating it, but for western osu players it is completely counterproductive.

Fifth, Japanese kanji and Chinese characters are not the same. With kanji this is a non-issue; each kanji does not have its own syllable. Sometimes a word consisting of two kanji will have a three-syllable pronunciation, and a kanji itself can have multiple pronunciations depending on the word that comprises it. Splitting up each character into a single capitalized word is not even possible, so there's no point in comparing them.

Lastly, Chinese is generally referred to as logographic rather than ideographic, as a character represents a morpheme rather than a more nebulous concept, and as ideogram usually refers specifically to a symbol that is independent of any corresponding sound--although of course no logographic writing system is without a phonetic component built into it. The terms themselves are rather fuzzy anyways, so to achieve anything of actual accuracy one has to resort to such ungainly terms as HW's "ideophonographical." However, to call Chinese logographic is not incorrect. In fact, most people, even linguists, do it.


To the crux of the issue.


The real dichotomy here is between practicality and officiality/aesthetics. That is a highly subjective discussion and is conducive to many (as seen here) tetchy discussions. Grouping words together will almost certainly make it more convenient for non-Chinese speakers, there should really be no question about this. I personally don't even pay attention to the name of a Chinese map if it's over three or four characters long; the profusion of capitals and spacing, to my English-speaking mind, is simply inconvenient, and I would rather memorize the mapper's name, the artist's name, and the background instead. Japanese titles, meanwhile, are multisyllabic, and I would rather have a few multisyllabic words than six monosyllabic words. How closely we adhere to "ISO 7098" really should not be a question. We're a small international circle-clicking community, not an official international organization, so shouldn't we rather consider things from a functional, practical perspective?

Of course, such a change would have its downsides, and I suspect that the main, unvoiced (if I may be so presumptuous) gripe that so many Chinese speakers have with this proposal is largely aesthetic. The elegance of the Chinese language lies precisely in the symmetry and ambiguity that this proposal will do away with. In Chinese each character is given equal spatial heft, and to consolidate multiple words would rob them both of their spatial importance as well as the importance that a capitalized letter lends them. In short, when comparing "Wei Lai Shi" to "Weilaishi," Weilaishi to the sensibility of the Chinese speaker (and even to mine partly) seems ugly, wrong, amateurish, and not at all official. Similarly, the troubling part of the inconsistency of word-division romanization having no "stable standard" as HW put it--the troubling part is not that this inconsistency is practically unfeasible, but that inconsistency is aesthetically unappealing. It is not of "official" quality.

Believe me, when it comes to officiality people will often be perfectionist, especially when they have a say in the matter. Why are there so many rhythm game elitists that condescend on osu? Because other rhythm games, with their shinier interfaces and their licensed songs, are more "official." Why is rankability not centered around actual merit, but only flawlessness; why are people so often concerned about whether a certain controversial map enters the ranked section, even if it doesn't affect them? Because the ranked section is the "official" section of the game, and people are perfectionist about it. These are not practical attitudes, but aesthetic ones, and so it is here too, I think. Why should the Chinese not be concerned with how their language is rendered to other people? I, too, would be bothered.

Of course, an aesthetic claim is not as defensible as a practical one, so other, more practical-sounding arguments are resorted to (perhaps subconsciously), but to me these arguments are ultimately immaterial. Practically speaking, word-division is far more useful than syllabic division--the rare ambiguity can be cleared simply by referring to the musical/lyrical context, and the even rarer intentional ambiguity (puns, etc.) can be left simply as single-syllable words, as with the status quo. And yet, due to my aesthetic sensibilities, I prefer the status quo; that is my personal opinion. I can and have been making do with mapper/artist name and background to search out the maps I need.

And of course I haven't even addressed the question in the case that Chinese players do actually rely heavily on romanized titles in osu. If they do, and it is easier for them to have a one-word-per-syllable romanization, then even less of a reason to change.


Lastly,


and off-topic, I would like to say that it is very easy to judge others, and parse their words and find their flaws, but difficult to do the same to yourself. The habit is to be severe towards others but generous only towards yourself and others like you/close to you. This makes it not just possible, but very common for someone to, with one breath, send a rude message to someone and, in the next breath, accuse them of being condescending. Similarly, it makes it possible for someone to accuse someone of sending them a rude message and, in the very next paragraph, act in a supercilious and condescending manner, and throw passive jabs towards their life/background and incivility, and accuse them of barbarity.

https://plato.stanford.edu/entries/nietzsche/#PoweLife
Nietzsche's metaphysical (mostly applicable psychologically) doctrine of the Will to Power is the idea that humans' primary pursuit is towards that which will increase their power in relation to others. You can see this surface most often not in the large, sweeping motions of world politics, but in the minutiae of everyday conversation and discourse between people who are, shall we say, less than friends (and sometimes even amongst friends as well.) Couple the Will to Power with this quote from Fyodor Dostoevsky: “Lying to ourselves is more deeply ingrained than lying to others,” and you have 95% of society in two short ideas.

So people will resort to silly antics to inflate their sense of power in relation to others, to deft manipulations of truth and to strawmanning and to posturing/boasting. Someone posted a long essay? Let me post an even LONGER essay with even BIGGER words, otherwise they and others might think they are right and I am not (so people have accused Wafu, and maybe they'll accuse me of it too). Someone is using such self-assured language that a tiny part of me thinks he might be right? Let me post this incriminating screenshot of him, or tweet about it so people will agree with me and I will be more assured that I am in the right, and he in the wrong. Someone said I have no priority? Let me capitalize on his poor phrasing rather than consider his words generously and in context, and not even consider that he may simply have worded his thoughts more hostilely than he intended to. Someone is upsetting me with his word choice? Let me throw in the words "arrogant," "fallacious," "non-sense," "barbarian," (just some words I have picked from posts on both sides) and whereas they are in the wrong if they use it, I am not.

This is also why many debates I've witnessed offshoot into various side unrelated directions, in an effort to prove the opposition wrong about anything at all. It's why people will carry on a debate for so, so long, and put so much effort into it, because to lose or even to not reply is to be lowered in power/status. It's why people will resort to strawmanning and ad-hominems, and why people will pick out the weakest arguments on the other side and take those apart while ignoring everything else. Really, if you go around for a week or a month with the idea of the will to power in the back of your mind, just observing (especially on the internet), many many things will become apparent to you.

Yes, it's so easy to look at another person's argument and see it in the worst light possible. Wafu said, "You, as Chinese have no priority in this matter, just because it's about Chinese"? Maybe he only means that he believes the purpose of romanization in osu is to help players not familiar with Chinese to navigate through Chinese song titles, and in light of this it is with extra consideration towards non-Chinese speakers that we should think about this metadata proposal. Even if you disagree, is that so unreasonable a proposition? Correctly interpreted, it certainly cannot be the "most hilarious thing i've saw this day."

And the exact opposite opinion, that "Chinese people have the exact highest priority in this matter, just because it's about Chinese," is not so unreasonable either. As I've said above, why should Chinese players not be concerned with how their language is rendered to other people, and on a website and game that they themselves frequent and have been part of for years? Why would they not be upset if someone else tries to push changes that they dislike, concerning a language that they have spoken for years? And that, too, makes sense.

Before anybody accuses me of being a hypocrite, yes, I am a hypocrite, just like anyone else I do these things often. And you can make fun of me for this long post if you want. In the end, all I want to say is, not just here, but in mapping and modding and even in life, be generous not just to yourself but also to others, even if you don't like them.
_handholding
After reading oko’s (and a few others) annoyingly frustrating post it was such a pleasure going through yours melloe. Thank you for your input
Nyquill
This thread is impossible to dissect, so I'm just gonna post my two cents about the actual topic instead of whether or not who has priority.

The first thing we learned in elementary school in China was how to romanize. Even people who have Mandarin as their first language learned to romanize via pinyin before anything else. This might sound weird but pinyin is the most intuitive way to understand how to pronounce words without knowing what any characters mean.

What is the issue with joining syllables in romanizations? Well...

For starters, compounds are very loosely defined in mandarin despite being littered with them. In Japanese, you can compound kanji and they would have the hanzi reading as a result. Because of the literal different reading, it's intuitive to compound the romanization as well.

Does joining syllables make words easier to read for people who speak English? It's hard to say. Personally, I find a jumble of pinyin equally as nonsensical as each of them individually without intonation.

It's worth mentioning that even the people who establish and practice these standards have trouble determining when to compound and when not to, and thus provide alternate romanizations:


From the US library of congress romanization guidelines

I think it would be the most consistent if we retain the word-by-word romanization method. There's no way anyone would mess up if we do. If you were to learn Chinese, your school would most likely teach you pinyin using the word-by-word method, so that's a plus. Then again, I haven't learned Chinese in well over 15 years, so take that as you will.

Alternatively, we can also implement this document here: https://www.loc.gov/catdir/cpso/romanization/chinese.pdf

Names and titles will have joined syllables whereas everything else is separate. This is the way the American government romanizes Chinese today. In my opinion it's a lot of work for very little merit, but hey, we're adhering to standards an accredited institution set out.
Wafu

Nyquill wrote:

Alternatively, we can also implement this document here: https://www.loc.gov/catdir/cpso/romaniz ... hinese.pdf
iirc, this was one of the options we were considering. It could be worth discussing this option if people want to follow the standards. It solves some of the brought up issues. It doesn't solve the "v" issue, which should be fixed regardless of keeping or changing the current system as it has no basis other than keyboard layout.
abraker

Sieg wrote:

abraker wrote:

Any thoughts about mapping style or patterns the maps have being in tags?
I don't see any restrictions for this right now as long as they are related to the set. Also don't think that this worth specific mentioning.
The reason I mention this is that some maps in 8k mania tend to be either 8k or 7k+1 and it's impossible to know until you download and check them out. Putting down "SV" or "stream" or etc would also allow the added benefit to search for types of maps in any gamemode. I feel like something should be mentioned in guidelines.
Fycho
For the TV Size thing, drop some opinions:

For example this song: https://osu.ppy.sh/s/477045
The song has a game version without "~TV animation ver.~", and a later TV version labelled with "~TV animation ver.~" to distinguish them. They differ in instrumentation and lyrics. In this case, (TV Size) isn't necessary, but "~TV animation ver.~" is. The popular "~Anime Ban~" is a pretty similar case.

I believe there is a metadata discretion when handling things like this.
F D Flourite
Maybe I haven't read the whole thread completely, because there is a lot in it that can be ignored. I just want to share some intuitive thoughts about the Chinese language.

First of all, many Chinese still type in English to search for titles of Chinese songs in osu! for the sake of consistency. Personally, I'm used to typing pinyin to search for song titles because osu! in the past had poor support for unromanised searching (maybe it's because many maps from 2012 and earlier only have their romanised title, both for Chinese and Japanese songs, as metadata at that time was not strictly enforced).

melloe wrote:

Thirdly, to address the problems of grouping together romanized Chinese syllables into words. It is true that in grouping together syllables there is a lot of ambiguity, but much of that ambiguity should be able to be cleared in context. For instance, taking this charming example provided to us:

Hollow Wings wrote:

"Gu Niang, Shui Jiao Yi Wan Duo Shao Qian?"
this sentence mainly has two meanings:
1. "Hey gril, how much it costs if i buy a bowl of your dumplings?" (姑娘,水饺一碗多少钱?)
2. "Hey gril, how much it costs if i sleep you one night?" (姑娘,睡觉一晚多少钱?)
Context should be able to very easily clear up such ambiguities. What is the song about? What is the rest of the song saying? Context will provide an almost effortless resolution to such conclusions, which I imagine would comprise the vast majority of such instances.

However, some of those ambiguities will be purposely rendered in the form of puns etc., such as here:

Fycho wrote:

For example, specific examples like "他谁都打不过", it's used intentionally to represent two meanings that are "Nobody can beat him" and "He can beat everybody", "Ta / Shui / Dou Da Bu Guo" and "Ta / Shui Dou Da Bu Guo".
These will most likely make up such a negligible percentage of these instances of ambiguity that to go through with the proposed changes and deal with these intentionally ambiguous titles as they come up would not be completely remiss -- but I personally believe that even these hypothetical cases, however rare, should be considered before pushing any changes. That is just my opinion, ultimately it's not up to me.

In fact, for many contemporary Chinese ballads, the titles are deliberately written as such (in the form of puns). As for the first example given here, the song title can still be sexually suggestive even if its formal title is about dumplings. Because Chinese lyrics are not as logical as daily language, people can easily pick up the ambiguous meaning, as there is no way to distinguish the pronunciation difference in a sung tone without logical context. "Context will provide an almost effortless resolution to such conclusions", as you said, is often not true. Joining Chinese characters into a single word will often cause loss of meaning in this way. (I have more examples, one of which is my uploaded map)


Fourthly, about "v" vs "u." To Chinese speakers of course "v" makes the most sense, as that is the input they use in their everyday lives, but to the western audience, "v" will make absolutely no sense. "u" and "yu" are both inadequate romanizations of "ü," because "yu" will be pronounced "yoo" by most westerners, but "v" will be next to useless for everybody except for Chinese players. "v" is more ambitious in that it serves to correctly represent a specific sound instead of simply approximating it, but for western osu players it is completely counterproductive.

I'm not sure if you went through HW's post thoroughly, but there was an example given to prove that the change from "v" to "u" will result in a worse case under certain conditions: “绿光” & “露光” will both be romanised as "Lu Guang" while their actual pronunciations are completely different. For non-Chinese speakers, I don't think it is a better way either to pronounce the title or to remember it by any means. Ofc I understand that "v" has no connection with the actual pronunciation of "ü". I was also confused when I first used a keyboard to type Chinese. However, this is just general knowledge for all Chinese users and Chinese learners. That's how we Chinese grow up. So even though we may understand that "v" can be senseless in terms of pronunciation, I don't get why non-Chinese speakers should have the privilege of ignoring such knowledge (which is common to us) at all. When you want to memorize a title in a different language, accepting one small rule/regularity (actually it's really small) is not demanding, is it? In fact, for the Japanese romanisation "ra" (similarly, ri, ru, re, ro), the actual pronunciation is far from /ra/, and is somewhere in the middle between /ra/ and /la/. Personally I'd even say it sounds much closer to /la/ in general. But when you have to memorize it, you simply accept the convention of it being "ra". That's the same thing.
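
A minimal Python sketch of the collision described above, assuming the titles are given as toneless pinyin syllables with "ü" written out. The helper name `render` and its parameter are made up for illustration:

```python
def render(syllables, u_umlaut):
    """Capitalise each syllable, writing pinyin 'ü' as `u_umlaut`."""
    return " ".join(s.replace("ü", u_umlaut).capitalize() for s in syllables)


lv_guang = ["lü", "guang"]  # 绿光
lu_guang = ["lu", "guang"]  # 露光

for replacement in ("v", "u"):
    a = render(lv_guang, replacement)
    b = render(lu_guang, replacement)
    print(replacement, "->", a, "|", b, "|", "collide" if a == b else "distinct")
```

With "v" the two titles stay distinct ("Lv Guang" and "Lu Guang"); with "u" both come out as "Lu Guang". The sketch says nothing about which form is easier for non-Chinese speakers to read, only that the "u" mapping is not reversible for l- and n- syllables.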


Lastly, Chinese is generally referred to as logographic rather than ideographic, as a character represents a morpheme rather than a more nebulous concept, and as ideogram usually refers specifically to a symbol that is independent of any corresponding sound--although of course no logographic writing system is without a phonetic component built into it. The terms themselves are rather fuzzy anyways, so to achieve anything of actual accuracy one has to resort to such ungainly terms as HW's "ideophonographical." However, to call Chinese logographic is not incorrect. In fact, most people, even linguists, do it.

I don't know how you can call Chinese logographic so confidently, so I just want TRUE evidence. And I don't even want to read Wafu's post again because he was simply doing this again and again without compelling support. Anyways, the most intuitive way to think of the Chinese language is still ideographic, based on the Chinese education we receive for more than 12 years. Many words that are combined from two or more characters are also formed by joining the meanings of those characters together. For example, “未来”(future) can be split as “未”(not happening) and “来”(come). And the easy joint would be "has not come yet", which is close to the meaning of "future". And the word “银行”(bank) can be split as “银”(silver, which was the general currency in ancient China) and “行” (an organization/commercial firm focusing on specific fields, pronounced as Hang). And it's obvious that the joint of those two meanings is an organization/commercial firm focusing on money, which is a bank.

The third example would be my own map https://osu.ppy.sh/s/598869 “花儿纳吉” (The actual correct pronunciation should be Hua Er Na Zei, which is different from normal Mandarin pronunciation Hua Er Na Ji). This title has no direct meaning from Mandarin as it's from minority Chinese language (Qiang language). The official meaning is "Being happy like a flower". However, the song title still has its similar meaning to the combination of Mandarin in Chinese culture , which was also part of intention by the song author: “花儿” is flower, “纳” is containing/accepting, “吉” is happiness. If being wrongly considered as logographic, the song title would be less valuable, which is what we cannot accept. There are just thousands of more examples so I have to stop here.

As a result, I completely don't understand why you guys keep trying to call Chinese logographic by any means. It's highly COUNTER-INTUITIVE. And in fact the change of combining characters is highly impractical (as you wanted to state below the opposite way) in this way, because simply consider each character as pronunciation (as logographic indicates) will result in MEANING LOSS and CULTURE LOSS, which is definitely a wrong way to approach to Chinese language.


To the crux of the issue.


The real dichotomy here is between practicality and officiality/aesthetics. That is a highly subjective discussion and is conducive to many (as seen here) tetchy discussions. Grouping words together will almost certainly make it more convenient for non-Chinese speakers, there should really be no question about this. I personally don't even pay attention to the name of a Chinese map if it's over three or four characters long; the profusion of capitals and spacing, to my English-speaking mind, is simply inconvenient, and I would rather memorize the mapper's name, the artist's name, and the background instead. Japanese titles, meanwhile, are multisyllabic, and I would rather have a few multisyllabic words than six monosyllabic words. How closely we adhere to "ISO 7098" really should not be a question. We're a small international circle-clicking community, not an official international organization, so shouldn't we rather consider things from a functional, practical perspective?

Sorry, but I just think the proposed change is even more impractical, for the reason stated above.
The rest of the quote doesn't bring any new ideas, so I've left it out of my post. Anyway, I'm not at all convinced that the changes to Mandarin/Chinese metadata would make it more practical. On the contrary, they ignore the general case of Chinese and make things even worse.
Shad0w1and

CrystilonZ wrote:

Shad0w1and wrote:

So let's face the reality, there isn't a standard for Chinese romanization into ANSI code. I can't understand that without a commonly accepted standard, why would you guys try to change the current metadata rule?
We've expressed (thoroughly I believe) what problems the current system has. Please read all the previous points made in this discussion.
no, you don't understand, using nonsense metadata will fuck up all Chinese players and all Chinese learner players. Wtf are you considering making a nonsense international standard and let all the players think wtf is the osu meta?

and if you don't understand why almost all Chinese players oppose this proposal, I am telling you it's because it will fuck up almost everyone who actually knows Mandarin. We don't want that to happen. Whether it's the lv problem or the ISO noun problem, they make no sense to Chinese learners and Chinese players, and this will make everyone struggle to search for songs. And you cannot just make an osu news post saying we changed the Chinese meta because we redefined an international standard !!!

also, romanization is not there so that English speakers can read the words. Lmao each semester my professors (in the US) struggle to read the names of students from Germany, Ireland, India, Russia and tons of other countries. Even if you romanize their names from the original language, it does NOT mean you can pronounce them. And Lv, Nv is the same case as Ra, Ri, Ru, Re, Ro in Japanese and similar cases in all other languages. You can't expect people who do not know the language to pronounce it correctly. The romanized meta, in this sense, is a way for learners to deal with song searching.
F D Flourite
And it seems like I have to open another post for Wafu

Wafu wrote:

1. First of all, you did use the ISO document as your argument, but you didn't even know that the citation about "ideophonograph" language was just confirming what CrystilonZ posted. You agree with ISO on the same thing that you disagree with CrystilonZ on. They state the same thing.
We have given many reasons why each Chinese character has its own meaning and why such meta would be ignored when words are joined together. And you're just repeating "same thing", "it's not non-sense", "you're ignoring what I am saying". That may work once, but not many times, because people don't see your supporting ideas. You are just repeating yourself.

Wafu wrote:

Yes, I agree with that point. Some Chinese characters indeed do use "pictographic and ideographic features". You even quoted me saying that. That doesn't make the language pictographic or ideographic, because even the characters with pictographic or ideographic features are logograms. That makes the language logographic. Why do you call something non-sense and then say the same thing?
Another repeat. First of all, please provide good evidence for "That makes the language logographic", or you're just repeating your own words and losing the ground you are trying to stand on. Secondly, while you accept that a major part of the language has "pictographic and ideographic features", you don't accept the fact that such features are lost when words are combined, which can be highly detrimental to the language meta. I just don't get it.

Wafu wrote:

7. If you have problem with me comparing how osu! works for 2 different Romanisation, I think there's a different problem. Stop calling me ignorant if you ignore what I've even written in that paragraph. It's also not non-sense. I literally just say how people work. How can that be non-sense? That is an observation.
The new system you are trying to introduce is hugely different from the Romanisation we've had, as HW, Fycho and I have illustrated with lots of evidence and facts ("五环境内", "他谁都打不过", "花儿纳吉"). So seeing how people worked in the past doesn't mean you're qualified to judge the romanisation of Chinese correctly. In fact I don't think any automatic system could handle such romanisation correctly. In these cases native Chinese speakers still have the louder voice.

Wafu wrote:

2. No, "v" doesn't work the best, it doesn't work at all because it has no linguistic basis. That doesn't mean "u" is the best, although we agreed that it generally won't make difference for a regular player, there are still many options that can be considered, but it can't be "v", and probably not "y", because that's associated with a different sound (even in other Romanisation systems we use). "u" pronounced in a certain way will result in the "ü" we are going for, it really is the core sound of it, I described this in the post 2 times already, so I guess I don't have to repeat myself.
I've already explained in my previous post that using "u" will cause terrible ambiguity. You insist that the pronunciation of "v" is different from what we want, while you ignore the fact that ALL OTHER letters have their own pronunciation in Chinese pinyin. The "v" is the most convenient one when you have to find something new to correspond to "ü". And again, it's just a convention of the language. That's how it has been used for decades. You non-Chinese speakers should not have the privilege of ignoring that. After all, it's not demanding for people to remember such a small thing if they want to approach Chinese. The pronunciation of "v" should not become a barrier to learning any Chinese, or they're bound to fail to learn it anyway.

Wafu wrote:

Third point, not sure why you are personally attacking me. How do you know what my education is, what my job is, what my real life is? You don't know any single thing about my personal life, so don't act like you do.
Answer:

Wafu wrote:

but remember it works vice-versa.

Wafu wrote:

Your false accusations (of us not reading stuff or not being professional) did, indeed, make me send you this message (and it is called exactly that: "Private message"). I'm not making fun of you as it was not public, you making it public doesn't mean I'm making fun of you. I wanted you to know that putting this down to "there's no research" was unfair of you, as you didn't invest your time into the research either. Was I being rude to you in the private message? Yes, as you were when you clearly and intentionally ridiculed the proposal, except I at least could keep it private.
Last but not least, insulting others in a private message doesn't mean that you were polite at all. It only means you pretended to be polite but failed, and that you wanted to hide the fact that you were not. So please learn to stay calm and polite consistently.
----------------------------------
Just an observation: no native Chinese speaker has ever supported these changes. You're trying to prove that the result is easier for non-Chinese speakers to memorize and pronounce, but we have tried super hard to show that you haven't studied Chinese in depth, and that there are tons of facts countering your idea that this makes the romanised result easier and better for the audience. But I haven't seen you acknowledge any of that, which is very disappointing in a discussion. The changes literally try to turn the Chinese language into something that Chinese speakers don't recognise, while you are completely indifferent about it. So do not say "conservative" again when you cannot find any other reason why all Chinese speakers disagree. After all, it still sounds ridiculous if no Chinese speaker agrees with changes to Chinese metadata.
Monstrata
If you guys could summarize what rule you want to change/add, that would speed up the process here a lot. Something like In the case of romanizing ü, use v, not yu. Or something like that. This is just an example btw. I disagree with using "v" as romanization since English speakers will pronounce "v" differently from how it's supposed to be pronounced in Mandarin.

I would like to add a few extra rules to the proposal, taking into account other languages that so far haven't been discussed, since everyone's been caught up on the Chinese romanization debate.

With respects to Korean romanization, I'm wondering if we should continue applying the McCune-Reischauer system for romanizing Korean. This is the system that the Library of Congress is using. Nyquill brought up an excellent point about using romanization systems that other large institutions are currently using and it works a lot better than creating our own modified system in most cases (unless we are simplifying).

I'm bringing this up because there is also the Revised Romanization of Hangeul system that was introduced on July 7th, 2000, which has been applied to various Korean road signs, transportation, etc... The major change of course being that the new system eliminates diacritics in favor of digraphs.

A possible rule would look like:

Songs with Korean metadata must be romanised using the McCune-Reischauer system for romanizing Korean when there is no romanisation or translation information listed by a reputable source. The same applies to the Source field if a romanised Source is preferred by the mapper.

Additionally, we could introduce the use of digraphs and two-vowel letters into the proposal:

Vowels ㅓ /ʌ/ and ㅡ /ɯ/ should be written as digraphs in Korean romanization, and romanized to eo and eu respectively.

Another language to examine is Thai. The Library of Congress recommends nine additional rules for Thai romanization which are:

Library of Congress wrote:

Romanization
1. Tonal marks are not romanized.
2. The symbol ฯ indicates omission and is shown in romanization by “ … ” the conventional sign for ellipsis.
3. When the repeat symbol ๆ is used, the syllable is repeated in romanization.
4. The symbol ฯลฯ is romanized la.
5. Thai consonants are sometimes purely consonantal and sometimes followed by an inherent vowel romanized o, a, or ǭ depending on the pronunciation as determined from an authoritative dictionary, such as the Royal Institute's latest edition (1999).
6. Silent consonants, with their accompanying vowels, if any, are not romanized.
7. When the pronunciation requires one consonant to serve a double function – at the end of one syllable and the beginning of the next – it is romanized twice according to the respective values.
8. The numerals are: ๐ (0), ๑ (1), ๒ (2), ๓ (3), ๔ (4), ๕ (5), ๖ (6), ๗ (7), ๘ (8), and ๙ (9).
9. In Thai, words are not written separately. In romanization, however, text is divided into words according to the guidelines provided in Word Division below.
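As a rough illustration of rule 8 only, here is a minimal Python sketch that maps the Thai numerals to ASCII digits. The function name `romanise_thai_digits` is made up for this example and is not part of any osu! or Library of Congress tooling; it relies only on the ten Thai digits occupying the consecutive code points U+0E50 to U+0E59.

```python
# Build the Thai-digit -> ASCII-digit table from rule 8 above.
# U+0E50 is the Thai digit zero; the ten digits are consecutive code points.
THAI_TO_ASCII = {0x0E50 + i: str(i) for i in range(10)}


def romanise_thai_digits(text: str) -> str:
    """Replace Thai numerals with 0-9; every other character is left as-is."""
    # Hypothetical helper for illustration only.
    return text.translate(THAI_TO_ASCII)


# "\u0e52\u0e55\u0e56\u0e50" is the four Thai digits 2, 5, 6, 0.
print(romanise_thai_digits("\u0e52\u0e55\u0e56\u0e50"))  # 2560
```

The other eight rules need a dictionary or human judgement, so a lookup table like this would only ever cover the numerals.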
My question for Thai romanization is whether we should treat them similarly to how we are treating Chinese romanization which is to separate words with spaces, or if we should clump them together, for example: พระนางเจาพระบรมราชินีนาถ romanized as: Phranāng Čhao Phrabǭrommarāchinī Nāt or Phranāngchaophrabǭrommarāchinīnāt. Also, should we make all the separated words uppercase, or only the first? Since Thai chains everything together, there is no indicator for upper and lower case when we split the phrase up (if we do).

The two rules I am proposing are:

Songs with Thai metadata must be romanised using the Library of Congress system (also known as ISO 11940) for romanizing Thai when there is no romanisation or translation information listed by a reputable source. The same applies to the Source field if a romanised Source is preferred by the mapper.

and

In the romanization of Thai, words should be romanized separately, and separated by a space. Additionally, all words should (or should not?) be uppercased.

Attached are helpful transcription keys for Thai:






Another language that is becoming more and more relevant is Arabic, and there are some issues I would like to bring forth with regards to its romanization.

Here is the table for romanization of Arabic:


As you can see, some issues come up. In the romanization of ص ص ص ص for example, (whether initial, Medial, Final, or Alone) the romanization becomes " ṣ" however, the diacritical mark is not something that can be used by osu because it is still not unicode. I would like to propose that all of these diacritical "," attached to letters be removed for the sake of simplicity and because osu currently does not support them. Therefore something like " ص◌نضوِ◌خ" should be romanized as "sandwich".
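For anyone wondering what removing the diacritical marks could look like in practice, here is a minimal Python sketch. It assumes the romanised text is available as ordinary Unicode text, and the helper name `strip_diacritics` is made up for this example. It decomposes each letter with the standard `unicodedata` module and drops the combining marks, so "ṣ" becomes "s".

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks (dots, macrons, carons...) and keep the base letters."""
    # NFD splits a letter such as "s with dot below" into "s" + a combining dot.
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() is non-zero for combining marks, zero otherwise.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("\u1e63"))       # s  (input is s with dot below)
print(strip_diacritics("Phran\u0101ng"))  # Phranang  (input has a with macron)
```

This is lossy by design: two letters that differ only by a diacritic end up identical, which is the trade-off the proposal above would be accepting.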

Another problem with Arabic is that it is typed in reverse, right to left. Should we also apply this to romanization? In this case "ص◌نضوِ◌خ" would actually be romanized as "hciwdnas" when read left to right as English readers are expected to do.

The rule I am proposing is:

Songs with Arabic metadata must be romanised using the Library of Congress system for romanizing Arabic when there is no romanisation or translation information listed by a reputable source. The same applies to the Source field if a romanised Source is preferred by the mapper.

Additionally:

In the romanization of Arabic, words should be romanized in reverse order, and the last letter should be uppercased. For example in romanizing "◌ س◌ !" the correct romanization should be "!usO"

However, there is also the problem of Judeo-Arabic romanization which differs slightly from traditional Arabic romanization. Judeo-Arabic of course, stems from the Jewish Arabs, many of whom live in Iraq and have adopted a slightly different script with respect to certain nouns and verbs. The most common Jewish Arabs are those from Baghdad. Anyways, I digress.

Attached are examples of Judeo-Arabic romanization:


So I would like to propose the following:

Songs with Judeo-Arabic metadata must be romanised using the Library of Congress system for romanizing Judeo-Arabic where Judeo-Arabic nouns and verbs are being used, and where there is no romanisation or translation information listed by a reputable source. Where Judeo-Arabic words and phrases are not used, traditional Arabic romanization will apply. The same applies to the Source field if a romanised Source is preferred by the mapper.

Yet another language I would like to cover is the Cherokee language, also known as Tsalagi Gawonihisdi, which is an Iroquoian language of the native Cherokee people, of whom there are approximately 300,000 tribal members. In terms of syllabary, I again lean to the ALA-LC Romanization table that was prepared by the Library of Congress attached here:



Because the language does not use capitalization, I am wondering if a rule should be made to force lower case on all songs, titles, artist, and sources with Cherokee origins. Below I propose the two following rules:

Songs with Cherokee metadata must be romanised using the syllabary provided by the ALA-LC Library of Congress system for romanizing Cherokee when there is no romanisation or translation information listed by a reputable source. The same applies to the Source field if a romanised Source is preferred by the mapper.

As well as:

Songs with Cherokee metadata must use lower case across Title, Romanized Title, Artist, Romanized Artist, and Source.

Lastly, I would like to bring attention to another pictographic language. In fact, Chinese is not the only pictographic language left in the world. I'm sorry, Hollow Wings, Chinese is special, but it is not that special. You guys have forgotten about the ancient Egyptian Hieroglyphics. Such a shame.

Below is a chart on monoliteral hieroglyphs and their hieratic equivalents as researched by R. Lepsius in his book Denkmäler aus Aegypten und Aethiopien Abth. II





As you can see, not all hieroglyphs have been translated yet, and some are still in the process of being discovered due to many pyramids and ancient Egyptian pyramids currently being lost to time. Therefore I have a set of rules to propose:

Songs with ancient Egyptian Hieroglyphic metadata must be romanised using the current knowledge of Egyptian Hieroglyphs for romanizing Egyptian Hieroglyphics when there is no romanisation or translation information listed by a reputable source. The same applies to the Source field if a romanised Source is preferred by the mapper.

Additionally, because not all Hieroglyphs have been transcribed yet:

Songs with ancient Egyptian Hieroglyphic metadata that use a hieroglyph that is currently not transcribed should be replaced with "?" until a proper transcription is decided on. The same applies to the Source field if a romanised Source is preferred by the mapper.

Lastly,

Songs with ancient Egyptian Hieroglyphic metadata that use a hieroglyph that has not yet been documented should be sent to the American Research Center in Egypt for proper processing. This rule mainly applies to mappers who are currently on archaeological digs in Egypt and find a new pyramid and want to map the songs that were uncovered.

I hope this will be of use to you guys, and I hope to see some more fruitful and productive discussion come about.
Nao Tomori
from what i can tell, there are two main issues... first: splitting syllables 1 by 1 and second, using v for ü.

for the first one, syllable by syllable makes more sense. that is evident through the fact that creating words out of specific syllables will block other readings. it's true that it might be easier to memorize if artificial words are created but those would necessarily be arbitrary since those divisions between syllables (to create words) are not used in the actual language. at that point the romanization would be inaccurate...

second: the v shit
the point of romanization is to create a "word" in latin script that can be read by westerners. what chinese people do while texting unfortunately is not important to this discussion
v would not be pronounced as anything resembling ü by any westerner. using something that would definitely be pronounced completely wrong doesn't suit the purpose of romanization. using something like "ue" or "yu" which is (as i understand) how ü is supposed to be said makes much more sense.
Hollow Wings
i'll turn simple now.

------

CrystilonZ wrote:

To be honest I'm very pissed off right now and you have no idea how hard it is for me to post in this calm manner.


You, as Chinese have no priority in this matter, just because it's about Chinese. I believe all people here are civilised people and civilised people argue with reason. Read more about this here <Argument from authority>
i've been pissed off ever since the idea of "Wafu: 'You, as Chinese have no priority in this matter, just because it's about Chinese,' " came up.

HOW THE HELL CAN YOU SAY OR AGREE WITH THAT?

if anyone other than Chinese people is ever defined as having higher priority than Chinese people, that'll be a huge humiliation to us.

NO ONE IS EVER GONNA GET ANY HIGHER PRIORITY THAN CHINESE PEOPLE ABOUT CHINESE MATTERS.

you are telling me to figure out what "nonsense" means to you, then you gonna figure out what that point of view means to us.
and i'll keep using the word "nonsense" if you keep a point of view like that.
and i'll keep calling your posts based on that point of view "nonsense" if you keep regarding Chinese people like "You, as Chinese have no priority in this matter, just because it's about Chinese, ".

wanna "vice-versa"? here it is.

------------------

look at the current topic's direction, full of useless discussion about concepts or grammar, none of which contributes to the topic any longer.

CrystilonZ or wafu never got the most important point, even though they think they know it, like "i agree with you that Chinese shall be understood with context".
↑ hell no.

we Chinese who got involved in this topic have already explained this lots of times, but none of your answers ever solves the real problem.

the trouble with ambiguous word separation doesn't only arise during romanisation, because:
Chinese characters may be written to represent several meanings on purpose.
that will affect how we separate words.

AND I'M TIRED OF SHOWING EXAMPLES.
but i'll still give one more if it can eventually let you understand.

↓ EVERYONE NOTICE THIS EXAMPLE PLEASE BECAUSE I THINK THIS IS IMPORTANT. PLEASE. ↓
song title: "爱在上" → "Ai Zai Shang"
1. "爱 在 上" → "Ai Zai Shang" → "love is above (sth.)"
2. "爱 在上" → "Ai Zaishang" → love is paramount.

context:
lyric example: 天苍苍 爱在上 抬头就仰望
meaning 1: love is above the place higher than sky which is already very high.
meaning 2: love is paramount that sky can't compare to it.

"在上" is an adjective in Chinese which means "the top/supreme/etc."
"在 上" is a short sentence which can be analysed like "A在B上" and means "A is above B"

both of those meanings are correct answers for the original Chinese characters "爱在上".
and also, both meanings need to be represented because that's the lyric's purpose: to show how great love is, both in a physical, metaphorical way and in a mental way.
↑ EVERYONE NOTICE THIS EXAMPLE PLEASE BECAUSE I THINK THIS IS IMPORTANT. PLEASE. ↑
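to make the two options concrete, here is a small Python sketch (the `romanise` helper and its grouping input are made up for this illustration, they are not an existing tool). it fuses the syllables inside each group into one capitalised word, so the same three characters give a different title depending on which separation is chosen beforehand.

```python
def romanise(groups: list[list[str]]) -> str:
    """Fuse the pinyin syllables of each group into one capitalised word."""
    # Hypothetical helper: the caller has to decide the word separation first.
    return " ".join("".join(group).capitalize() for group in groups)


# One character per word: no word separation is decided by the romanisation.
per_character = romanise([["ai"], ["zai"], ["shang"]])
# "zai" and "shang" grouped: the title is now fixed to reading 2 only.
grouped = romanise([["ai"], ["zai", "shang"]])

print(per_character)  # Ai Zai Shang
print(grouped)        # Ai Zaishang
```

the grouped output has to pick one separation, while the per-character output does not have to pick any.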

now answer me to the point: how to build a transcription that can express both of those meanings?
separated romanized words won't do it because none of them express the original Chinese characters or words or sentences correctly.


you're not gonna build a better transcription system if you can't solve problems like this, which happen widely when separating Chinese words.

this goes far beyond the question of whether Western people or non-Chinese speakers could understand/remember/search/etc. Chinese things better or not, it's about the romanisation of Chinese itself.


and no one has answered this, they just keep arguing about endless useless things.

------------------

i'll do no more examples because i've already showed enough.

the best choice of Chinese romanisation system for the osu community will always be the one-by-one-character method, which causes the least trouble in expressing Chinese characters' meanings.

and that's what ISO did at the world level.


and that's why i wanna communicate with international people using standards that have been proven or recognised, to cause less drama like this.

some people just don't get it and wanna push things forward, which is really beyond the scope of the osu community and has already been proven not to be a better choice.

if you gonna use some GB/T like "The Basic Rules of the Chinese Phonetic Alphabet Orthography", then i'll be pleased to talk about several other GB/T recognised by the PRC which are all about romanisation.
but it will still be a bad choice to me that we would rule this community by a single country's standard, and not the international one which already exists and is recognised.

if staff insist on creating new rules aside from any of them, then so be it.
i'm not staff so i can't do anything, maybe just sigh and wait for troubles to occur.

------------------

if nonsense continues, then fine, the problem in this post may never be solved and there's no solid evidence to change the current transcript system for Chinese characters.

i'll ignore all of those arguments or nonsense with concepts or whatever else without standards that can reach international level from now on.
Noffy
bless naotoshi for summarizing what others took essays to say into two easy to read paragraphs
this isn't even sarcastic
bless you naotoshi

Monstrata wrote:

Songs with Korean metadata must be romanised using the McCune-Reischauer system for romanizing Korean when there is no romanisation or translation information listed by a reputable source.
May I ask why you chose the McCune-Reischauer system in particular? Some quick research reveals it is currently only used officially in North Korea, with "Revised Romanization of Korean" being the current official system used for South Korea. As uhhh.... 99% of Korean songs, especially those mapped in osu, come from South Korea, it would make far more sense to use that instead, overall. This may need more input from those who are more familiar with Korean.

Monstrata wrote:

Another language to examine is Thai. The Library of Congress recommends nine additional rules for Thai romanization which are:
could you please cite said documents that you mention throughout your post by linking them in case anyone wants to review it for themselves.


Monstrata wrote:

Another problem with Arabic is that it is typed in reverse, right to left. Should we also apply this to romanization?
No, roman characters are written from left to right. It would obscure the meaning and reading to write them in reverse order. Japanese can also be written from right to left, albeit in vertical lines instead of horizontal, do we romanise it backwards? no.


Overall, besides maybe, yeah we should definitely address korean as kpop is pretty popular on osu, the rest of this seems like needless bloat to the ranking criteria due to how infrequently songs in Thai, Arabic, or Cherokee, or, in Hieroglyphs, would be mapped. These should continue to be handled case-by-case and use common sense like they currently are.




Also, in general guys: while this discussion about mandarin and chinese is definitely important, please be sure to not neglect reviewing the draft itself and pointing out any other areas it could be improved. I believe both sides of the Chinese debate have at this point said everything that needs to be said to represent their viewpoint, which leaves sorting out things like the romanisation of ü, or deciding if anything else should be considered when revising the current draft based upon these discussions.


Edit:
Additionally, please consider making tl;dr versions of your posts, this thread is nearly impossible for most people to read in its current state due to its sheer scale. It's gotten a bit out of hand.
Fycho
If "v" can't be read by foreigners and causes misconceptions, then we probably need to rework the Japanese rule as well, since ra / ri / ru / re / ro are actually pronounced as la / li / lu / le / lo in Japanese, which is kinda unfriendly towards Latin-script users who don't know Japanese. English speakers will pronounce "ra" differently from how it's supposed to be pronounced in Japanese.

As we all know, Modified Hepburn (Japan gov uses Kunrei) and Pinyin (China gov uses Pinyin) are the international standard systems. People who learn Chinese start with pinyin, and when they later start learning input methods, they get to know "v", so "v" is the most familiar and well-known letter for Chinese speakers and learners. "u" clashes with the vowel "u", and "yu" would be pronounced as "yoo" or in some other wrong way by most English speakers. Both are inadequate for representing "ü". If anyone has a better choice than "v", feel free to advise rather than suggest useless stuff. Otherwise we will keep the "v" for "ü".
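To show the clash concretely, here is a minimal Python sketch that swaps "ü" for each candidate discussed so far. The helper name `ascii_pinyin` is made up for this example, and it assumes toneless pinyin with "ü" stored as the single character U+00FC.

```python
def ascii_pinyin(syllable: str, replacement: str = "v") -> str:
    """Replace u-umlaut in a toneless pinyin syllable with an ASCII stand-in."""
    # Hypothetical helper; "\u00fc" is lowercase u-umlaut, "\u00dc" is uppercase.
    return (syllable.replace("\u00fc", replacement)
                    .replace("\u00dc", replacement.upper()))


# Compare how each candidate treats the pair "l" + u-umlaut versus plain "lu".
for candidate in ("v", "u", "yu"):
    print(candidate, ascii_pinyin("l\u00fc", candidate), ascii_pinyin("lu", candidate))
# v lv lu
# u lu lu
# yu lyu lu
```

With "u" the two syllables become identical, while "v" and "yu" both keep them apart, so the remaining disagreement is only about which of those two reads better.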

I'll give a summary for the discussions later. (already did at https://osu.ppy.sh/community/forums/top ... rt=6554329)
Monstrata

Fycho wrote:

If "v" can't be read by foreigners and causes misconceptions, then we probably need to rework the Japanese rule as well, since ra / ri / ru / re / ro are actually pronounced as la / li / lu / le / lo in Japanese, which is kinda unfriendly towards Latin-script users who don't know Japanese. English speakers will pronounce "ra" differently from how it's supposed to be pronounced in Japanese.

As we all know, Modified Hepburn (Japan gov uses Kunrei) and Pinyin (China gov uses Pinyin) are the international standard systems. People who learn Chinese start with pinyin, and when they later start learning input methods, they get to know "v", so "v" is the most familiar and well-known letter for Chinese speakers and learners. "u" clashes with the vowel "u", and "yu" would be pronounced as "yoo" or in some other wrong way by most English speakers. Both are inadequate for representing "ü". If anyone has a better choice than "v", feel free to advise. Otherwise we would keep the "v" for "ü".
It's not the same. R and L are pronounced almost the same way across most phonetics. V and u are way different since V is a consonant.

Ask yourself, how would you pronounce ü using english phonetics. The answer should not be "v" because that's a voiced labiodental fricative. Not a vowel.