oh there is: if you are English-speaking and map Japanese songs, the source shown in the top left in-game cannot be translated at all to tell English speakers what it is from without sounding cryptic to them
That's fine for Russian, but I don't see a reason to apply replace rules from Russian to Ukrainian as it is in the draft right now. As for other Cyrillic-based languages: we don't write Romanisation of Hieroglyphs\Chinese, so I don't see a reason why this is done for Cyrillic.
Okoratu wrote:
can some russians say something about http://up.kuraip.net/032209ex3724.pdf ? it seems to make sense and encompass all things said
Kurai wrote:
- Cyrillic Romanisation should follow the BGN/PCGN system (except for the letter ё in Russian which should follow the GOST 2002(B) system). Read more here: http://up.kuraip.net/032209ex3724.pdf
No, some Cyrillic-based languages use umlauts for Romanisation in BGN/PCGN.
pdf wrote:
Hopefully, the BGN/PCGN systems have been built so that Cyrillic can be rendered by using only the basic letters and punctuation found on English language keyboards.
Can you elaborate why you think this is acceptable for games but e.g. not for anime series?
CrystilonZ wrote:
just as third party stuff arent accepted as refs imo crunchyroll shouldn't be accepted as well i guess
for cases like games that are released in a lot of regions english names should be okay
e.g. both ポケモン超不思議のダンジョン and Pokémon Super Mystery Dungeon are fine
Crunchyroll wrote:
officially-licensed content from leading Asian media producers directly to viewers translated professionally in multiple languages
Proposed Rules wrote:
- Languages in Chinese language family must be Romanised accordingly. Do not Romanise Cantonese texts with Hanyu Pinyin method of Romanisation.
- Songs with Mandarin titles and/or Mandarin artists must use the Hanyu Pinyin method of Romanisation when there is no Romanisation or translation information listed by an official source. The ü vowel should be Romanised into u and all diacritical tone marks should be omitted because of the technical limitations resulting from the limited amount of characters allowed in the Romanised title/artist fields.
- For capitalisation and word separation, refer to The Basic Rules of the Chinese Phonetic Alphabet Orthography (汉语拼音正词法基本规则/漢語拼音正詞法基本規則). In short, generally every word should be separated and capitalised. Surname and first name are separated using a space and are capitalised.
- Particles (助词) are written separately and should not be capitalised.
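The character handling described in the proposed rule (drop tone marks, turn ü into u) can be sketched in a few lines. This is only an illustration, not anything osu! itself uses; the function name `strip_pinyin_diacritics` is made up for the example, and it assumes the input is already Hanyu Pinyin written with diacritics.

```python
import unicodedata


def strip_pinyin_diacritics(text: str) -> str:
    """Remove tone marks and the diaeresis on ü (hypothetical helper)."""
    # NFD splits each accented letter into a base letter plus combining marks,
    # e.g. "ǎ" -> "a" + caron, "ü" -> "u" + diaeresis.
    decomposed = unicodedata.normalize("NFD", text)
    # Dropping every combining mark removes the tones and maps ü to u.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


print(strip_pinyin_diacritics("túshūguǎn"))  # tushuguan
print(strip_pinyin_diacritics("Lü Guang"))   # Lu Guang
```

Note that this treats ü and u identically after stripping, which is exactly the ambiguity debated in the replies.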
The difference in pronunciation of u and ü is acknowledged but Romanising a vowel into something like v is most likely not the best idea.
CrystilonZ wrote:
- The one-character-one-word method is impractical. Similar to Japanese, one Chinese character does represent one single syllable. However, a word is not necessarily comprised of one syllable (like Japanese, Chinese is a polysyllabic language). For example 图书馆 (túshūguǎn) as a whole means library, and writing 'li bra ry' would defeat the purpose of Romanisation by not resembling the structure of languages using the Roman alphabet.
- Using v as the Romanisation of the vowel ü is nonsense. The purpose of Romanisation is to enable players to read titles / artist names written in scripts that are foreign to them. For anyone that does not know Mandarin and/or how pinyin works, Lv Guang (Lü Guang) is just begging to be read as Level Guang.
- The current Romanisation method is baseless and irrational considering the linguistic specificities of the Mandarin language. The current method is based on a discussion among only a small number of people.
Wafu wrote:
I think it is worth doing a little comparison of the current system and the system in the proposal to highlight the pros a bit more. So far, no issues that aren't solved or would have to be solved were brought up. We obviously accept your opinions, but they must be on topic and must be an actual issue that the system has.
- Current system
- Titles are easy to read ✘ (most people will read every syllable as if it were one word)
- Titles are easy to remember ✘ (words are easier to remember than separate syllables, humans remember the words easier by their shape)
- Fits the rules of Latin script (Romanisation = writing words from other script to Latin/Roman script) ✘ (Latin script is alphabetical, therefore separating each syllable doesn't make sense and doesn't read well for majority of Latin script)
- Fits the rules of Chinese script ✘ (Impossible, if you want to make it "fit" to the Chinese script, you would have to replace each character with one logogram, Latin alphabet doesn't have logograms. Chinese is also not syllabary script, so separating each syllable again doesn't make sense.)
- Differentiates between different Romanisations and meanings of the same sequences of characters. ✘
- Includes tones in Romanised text ✘ (Impossible with characters which we are limited to. You could use "a1", "a2" (redundant) etc., but that would make the text incomprehensible, majority of people wouldn't even know how to pronounce it)
- Doesn't replace characters with others which have no evidence of being similar to the intended character. ✘ (ü is replaced with v, which doesn't seem to be supported by any logical argument)
- You can use a different Romanisation system for dialects where the current system wouldn't work at all ✘
- Isn't related to politics ✘ (Impossible, picking any Romanisation system is picking a side, every Romanisation system is related to politics)
- Proposal
- Titles are easy to read ✔
- Titles are easy to remember ✔
- Fits the rules of Latin script (Romanisation = writing words from other script to Latin/Roman script) ✔
- Fits the rules of Chinese script ✘ (Impossible, if you want to make it "fit" to the Chinese script, you would have to replace each character with one logogram, Latin alphabet doesn't have logograms.)
- Differentiates between different Romanisations and meanings of the same sequences of characters. ✔
- Includes tones in Romanised text ✘ (Impossible with characters which we are limited to. You could use "a1", "a2" (redundant) etc., but that would make the text incomprehensible, majority of people wouldn't even know how to pronounce it)
- Doesn't replace characters with others based on no evidence that they are similar to the intended character. ✔
- You can use a different Romanisation system for dialects where the current system wouldn't work at all ✔
- Isn't related to politics ✘ (Impossible, picking any Romanisation system is picking a side, every Romanisation system is related to politics)
Below is copied from my post in the original thread; it gives some explanation and facts about Chinese:
Proposed Rules wrote:
- Songs with Mandarin titles and/or Mandarin artists must use the Hanyu Pinyin method of Romanisation when there is no Romanisation or translation information listed by an official source. The ü vowel should be Romanised into u and all diacritical tone marks should be omitted because of the technical limitations resulting from the limited amount of characters allowed in the Romanised title/artist fields.
There is already a vowel "u", so using "u" for "ü" just mixes the two up; currently there is no better choice than "v". Only in a very few cases may "yu" be used for people's names, and even that is arguable, so I cannot consider it suitable to represent every "ü".
Wafu wrote:
- Current system
- Titles are easy to read ✘ (most people will read every syllable as if it were one word)
- Titles are easy to remember ✘ (words are easier to remember than separate syllables, humans remember the words easier by their shape)
- Fits the rules of Latin script (Romanisation = writing words from other script to Latin/Roman script) ✘ (Latin script is alphabetical, therefore separating each syllable doesn't make sense and doesn't read well for majority of Latin script)
- Fits the rules of Chinese script ✘ (Impossible, if you want to make it "fit" to the Chinese script, you would have to replace each character with one logogram, Latin alphabet doesn't have logograms. Chinese is also not syllabary script, so separating each syllable again doesn't make sense.)
- Differentiates between different Romanisations and meanings of the same sequences of characters. ✘
- Includes tones in Romanised text ✘ (Impossible with characters which we are limited to. You could use "a1", "a2" (redundant) etc., but that would make the text incomprehensible, majority of people wouldn't even know how to pronounce it)
- Doesn't replace characters with others which have no evidence of being similar to the intended character. ✘ (ü is replaced with v, which doesn't seem to be supported by any logical argument)
- You can use a different Romanisation system for dialects where the current system wouldn't work at all ✘
- Isn't related to politics ✘ (Impossible, picking any Romanisation system is picking a side, every Romanisation system is related to politics)
- Proposal
- Titles are easy to read ✔ (reply: sorry, but as a Chinese speaker I don't think they are easier to read than the current separated romanisation format)
- Titles are easy to remember ✔ (reply: from a Chinese speaker's side, they are not easier to remember than the current separated romanisation either)
- Fits the rules of Latin script (Romanisation = writing words from other script to Latin/Roman script) ✔ (reply: Latin script doesn't fit Chinese; from my side the Pinyin system is much better too)
- Fits the rules of Chinese script ✘ (Impossible, if you want to make it "fit" to the Chinese script, you would have to replace each character with one logogram, Latin alphabet doesn't have logograms.)
- Differentiates between different Romanisations and meanings of the same sequences of characters. ✔ (reply: as explained above, for Chinese characters, Pinyin or any other romanisation system doesn't carry any meaning at all; it is just used as a marker for Mandarin)
- Includes tones in Romanised text ✘ (Impossible with characters which we are limited to. You could use "a1", "a2" (redundant) etc., but that would make the text incomprehensible, majority of people wouldn't even know how to pronounce it)
- Doesn't replace characters with others based on no evidence that they are similar to the intended character. ✔ (reply: since pinyin or any other romanisation/Latin letters don't stand for meaning (they are just used as markers, as I said), this feels unnecessary)
- You can use a different Romanisation system for dialects where the current system wouldn't work at all ✔ (reply: dialects don't have an official formal way of writing; even in HK, schools teach Mandarin grammar and write in standard Chinese grammar while using Cantonese only for pronunciation. There wouldn't be songs that use dialects in the song title and artist, so they don't need to be romanised in any case.)
- Isn't related to politics ✘ (Impossible, picking any Romanisation system is picking a side, every Romanisation system is related to politics)
Fycho wrote:
Below are the points I disagree with (my replies are in blue):
Wafu wrote:
Proposal
- Titles are easy to read ✔ (reply: sorry, but as a Chinese speaker I don't think they are easier to read than the current separated romanisation format) (response: as stated above, they are easier for the majority of the player base and they are not harder to read for the Chinese. That means no cons, just pros.)
- Titles are easy to remember ✔ (reply: from a Chinese speaker's side, they are not easier to remember than the current separated romanisation either)
- Fits the rules of Latin script (Romanisation = writing words from other script to Latin/Roman script) ✔ (reply: Latin script doesn't fit Chinese; from my side the Pinyin system is much better too)
- Fits the rules of Chinese script ✘ (Impossible, if you want to make it "fit" to the Chinese script, you would have to replace each character with one logogram, Latin alphabet doesn't have logograms.)
- Differentiates between different Romanisations and meanings of the same sequences of characters. ✔ (reply: as explained above, for Chinese characters, Pinyin or any other romanisation system doesn't carry any meaning at all; it is just used as a marker for Mandarin) (response: I think the point here is about homophones and stuff. The same sequence of pronunciation might produce two (or more) different meanings. Separating Romanised titles into words should help with the comprehensibility.)
- Includes tones in Romanised text ✘ (Impossible with characters which we are limited to. You could use "a1", "a2" (redundant) etc., but that would make the text incomprehensible, majority of people wouldn't even know how to pronounce it)
- Doesn't replace characters with others based on no evidence that they are similar to the intended character. ✔ (reply: since pinyin or any other romanisation/Latin letters don't stand for meaning (they are just used as markers, as I said), this feels unnecessary) (response: speaking about u and v here. v is just impossible to pronounce. I'm always open for a better alternative.)
- You can use a different Romanisation system for dialects where the current system wouldn't work at all ✔ (reply: this is only about Mandarin, dialects are not included. Cantonese, Wu Chinese (Shanghainese, Suzhou Hua, Wuxi, Hangzhou), Jiang–Huai Mandarin, Southern Fujian Dialect, Hakka Dialect, etc. don't need to be discussed for now) (response: this is speaking about the new rule, "Languages in Chinese language family must be Romanised accordingly." This opens room for other Chinese languages to use suitable Romanisation systems, not restricting all Chinese languages to one bad system which may or may not fit the language.)
- Isn't related to politics ✘ (Impossible, picking any Romanisation system is picking a side, every Romanisation system is related to politics)
Romanisation carries no meaning: both "Wei Lai Shi" and "Weilaishi" are just markers for "未来式" (logograms), and there is no difference in meaning between them. "Weilaishi" doesn't help comprehensibility for either people who speak Chinese or people who don't. Chinese speakers need to spend time converting it back to characters; for non-Chinese speakers it is just a marker/pronunciation of "未来式" with no meaning attached. They could use "Wei Lai Shi" to search; if they don't speak Chinese, how can they know "Weilaishi" is a whole word?
CrystilonZ wrote:
I think the point here is about homophones and stuff. Same sequence of pronunciation might produce two (or more) different meanings. Separating Romanised titles into words should help with the comprehensibility.
↑i don't know if CrystilonZ knows the whole Chinese language family clearly enough, so i'll add some additional things as basic background knowledge here.
CrystilonZ wrote:
Other languages that use the Chinese script are irrelevant to this proposal.
We are only talking about Standard Mandarin here and Mandarin is not equivalent to Chinese.
We only use 'Chinese' in the draft for simplicity. The wording will be changed if this is implemented.
Chinese is far different from Japanese. the syllable thing you are talking about may just be the difference between Japanese Hiragana and Katakana, but it is not that true for the Kanji part.
CrystilonZ wrote:
Similar to Japanese, one Chinese character does represent one single syllable. However, a word is not necessarily comprised of one syllable (like Japanese, Chinese is a polysyllabic language). For example 图书馆 (túshūguǎn) as a whole means library, and writing 'li bra ry' would defeat the purpose of Romanisation by not resembling the structure of languages using the Roman alphabet.
Proposed Rules wrote:
The ü vowel should be Romanised into u and all diacritical tone marks should be omitted because of the technical limitations resulting from the limited amount of characters allowed in the Romanised title/artist fields.
Please understand first: if you want to change the current rule, namely from ü to u, you have to prove FIRST that u is a better choice than v, instead of announcing that you are going to change it to "u" while asking us to provide a better choice. There are plenty of letters and characters that could be chosen, so why did you choose u? Just because they look similar after omitting what you call the "diacritical tone mark"? I don't think that is a reliable reason for this change, as judging only by visual appearance is pretty unprofessional when talking about romanization. Additionally, Fycho has already mentioned the potential mess that might result from changing v to u, indicating that this entry in the proposal is not only pros. Therefore, prior to this discussion, you should not simply say "The ü vowel should be Romanised into u…" and justify the change only by why "ü" cannot be implemented in the current system due to technical difficulties; you should explain with valid reasons why "u" is better than "v" ("u can be pronounced" is not a valid reason: there are many characters that can be pronounced, like a, e, i, o and some two-letter combinations like yu, which Fycho mentioned. All of them have pros and cons, so why did you give preference to u in this draft?), as well as how you are going to address potential problems if this "u" proposal is implemented.
CrystilonZ wrote:
speaking about u and v here. v is just impossible to pronounce. I'm always open for a better alternative.
I don’t think that, with the proposal, titles are easier to read and remember.
Previous Discussion wrote:
- Current system
- Titles are easy to read ✘ (most people will read every syllable as if it were one word)
- Titles are easy to remember ✘ (words are easier to remember than separate syllables, humans remember the words easier by their shape)
- Proposal
- Titles are easy to read ✔
- Titles are easy to remember ✔
I think the current Romanizing method is more informative. At the very least, the two are equally informative from a Latin-script point of view. Let me raise an example:
Previous Discussion wrote:
For my thoughts on the matter I don't understand how romanising 学不会 to Xue Bu Hui is more informative than Xue Buhui or Xuebuhui. Word separation can be ambiguous at times but whether you write Xue Buhui or Xuebuhui it's more informative than the Xue Bu Hui according to the current RC.
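To make the three spellings in that quote concrete, here is a small sketch of per-character versus per-word output. The word segmentation is supplied by hand as nested lists, which is an assumption of the example (deciding where the word boundaries are is the hard part being argued about), and the function names are invented for illustration.

```python
def romanise_per_character(words):
    """Current-style output: every syllable separated and capitalised."""
    return " ".join(syllable.capitalize() for word in words for syllable in word)


def romanise_per_word(words):
    """Proposal-style output: syllables of one word joined, each word capitalised."""
    return " ".join("".join(word).capitalize() for word in words)


# 学不会 segmented two different ways (segmentation chosen by hand).
two_words = [["xue"], ["bu", "hui"]]
one_word = [["xue", "bu", "hui"]]

print(romanise_per_character(two_words))  # Xue Bu Hui
print(romanise_per_word(two_words))       # Xue Buhui
print(romanise_per_word(one_word))        # Xuebuhui
```

The per-character form is the same no matter how the title is segmented, while the per-word form changes with the segmentation chosen.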
These statements are also problematic.
Previous Discussion wrote:
- Fits the rules of Chinese script ✘ (Impossible, if you want to make it "fit" to the Chinese script, you would have to replace each character with one logogram, Latin alphabet doesn't have logograms.)
- Differentiates between different Romanisations and meanings of the same sequences of characters. ✔
- Doesn't replace characters with others based on no evidence that they are similar to the intended character. ✔
For point 1. I just don't see how this is related to our discussion. "In automatic romanizing working progress"
Hollow Wings wrote:
1. In automatic romanizing working progress, there're two ways for Chinese Romanisation:
a. semi-automatic romanisation from Chinese words separated by following proper rules.
b. automatic romanisation from Chinese characters one by one.
2. During this period of time, most of other countries aside of PRC can't fully accept that romanizing Chinese characters into separated words according to combinations between Chinese characters, because the works of finding and dealing with the concept of Chinese words are complex, also the grammar of Chinese sentence can even blur it.
after thousand of thoughts, they decide to do the romanization work from Chinese characters one by one.
Read more about ideograms here. These are logograms. Modern Chinese characters are logographic.
Hollow Wings wrote:
- Egyptian hieroglyphs (eg. Ancient Egyptian) ←already dead
- Cuneiform script (eg. Ancient Sumerian) ←already dead
- Seal hieroglyphs (eg. Ancient Indian) ←already dead
- Maya hieroglyphs (eg. Ancient Mayan) ←already dead
- Chinese characters (eg. Chinese)
and NO MORE.
if you want to know why language system is like that, then that's a long story, i wont start telling them here.
the reason i pick up those truth above, is because i want you guys know the chinese language's specificity and leading to how different romanisation is done between alphabetic language and ideographic language.
This is not exactly true. If it were, Mandarin would have been dead a long while ago, because the only way to communicate would be to carry a crap ton of paper with you at all times and write stuff down when you want to communicate.
Hollow Wings wrote:
however, this is not reversible.
Don't you see that you are in the wrong topic?
abraker wrote:
Any thoughts about mapping style or patterns the maps have being in tags?
1. if the osu community doesn't use an automatic or semi-automatic working process, it'll be manual. and i just told you everything about why that process is complex.
CrystilonZ wrote:
For point 1. I just don't see how this is related to our discussion. " In automatic romanizing working progress"
omg... you don't even have any channel or way to read that document? then you may not know lots of the concepts it mentions.
CrystilonZ wrote:
2. Can you quote the exact words from the document? also all the reasons as stated in the standard as well. I couldn't read it while working on the proposal because 115 swiss franc is hella expensive.
seems like you are really obsessed with concepts, maybe it's my bad for simplifying those things.
CrystilonZ wrote:
Read more about ideograms here. These are logograms. Modern Chinese characters are logographic.
A number of lines after this are about pinyin being a method of transcription. No comments there this is acknowledged since the beginning that this is just the way to pronounce stuff. And the next few lines are about Mandarin having a lot of homophones.
lol
CrystilonZ wrote:
This is not exactly true. If it were, Mandarin would have been dead a long while ago, because the only way to communicate would be to carry a crap ton of paper with you at all times and write stuff down when you want to communicate.
Hollow Wings wrote:
however, this is not reversible.
In English context it would be equivalent to you guys seeing or hearing /tīm/ (IPA stuff. This reads time). Intuitively the first thing that come into your heads would be the time. Tick-tock clocky stuff. However under different contexts:
"Can you buy me some /tīm/. I'm going to use it to cook dinner." In this case /tīm/ is the herb thyme.
"I don't have enough /tīm/ to do my homework. It's due tomorrow." In this case it's "time"
"Two /tīm/ two equal four." In this context it means multiply. 10/10 grammar.
As you can see they are reversible with context. And when you guys speak to each other you're actively tracing back to the original Hanzi characters using their pronunciation. Therefore, saying that it is not reversible is not true. It's harder in Mandarin (410 syllables - crap tons of words. Do the maths) but the fact that there are people speaking Mandarin proves the fact that it's possible.
conservative = a person who favors maintenance of the status quo
So, first of all, I'm quite surprised you are even trying to prove that Chinese is not a logogram language. One says pictogram, one says ideogram, one says ideophonograph. I feel like you're trying to defend this so much that you have to take every single thing we said (even out of context, by the way) and simply make up something and say "do your research", while completely ignoring what we said. You, as Chinese, have no priority in this matter just because it's about Chinese, so stop acting like you are the better ones and we know nothing because we didn't read document X, which nobody has even provided us before (and one of you in particular is accusing others of not reading it, while misunderstanding it). Today's Chinese language uses only logograms. The origin of many of these logograms is pictographic or ideographic. You have to realise that something having pictographic/... features doesn't mean it's not a logogram. I think you should know that if you want to use it as an argument. What Hollow Wings said about this, by the way, proves exactly that CrystilonZ was completely right that Chinese is a logogram language; it's just that HW didn't understand the concept mentioned in the ISO file.
Fycho wrote:
Below are the points I disagree with (my replies are in blue):
tl;dr: The current romanisation of Chinese is fair enough; I don't think it needs to be revised.
Wafu wrote:
- Current system
- Titles are easy to read ✘ (most people will read every syllable as if it were one word)
- Titles are easy to remember ✘ (words are easier to remember than separate syllables, humans remember the words easier by their shape)
- Fits the rules of Latin script (Romanisation = writing words from other script to Latin/Roman script) ✘ (Latin script is alphabetical, therefore separating each syllable doesn't make sense and doesn't read well for majority of Latin script)
- Fits the rules of Chinese script ✘ (Impossible, if you want to make it "fit" to the Chinese script, you would have to replace each character with one logogram, Latin alphabet doesn't have logograms. Chinese is also not syllabary script, so separating each syllable again doesn't make sense.)
- Differentiates between different Romanisations and meanings of the same sequences of characters. ✘
- Includes tones in Romanised text ✘ (Impossible with characters which we are limited to. You could use "a1", "a2" (redundant) etc., but that would make the text incomprehensible, majority of people wouldn't even know how to pronounce it)
- Doesn't replace characters with others which have no evidence of being similar to the intended character. ✘ (ü is replaced with v, which doesn't seem to be supported by any logical argument)
- You can use a different Romanisation system for dialects where the current system wouldn't work at all ✘
- Isn't related to politics ✘ (Impossible, picking any Romanisation system is picking a side, every Romanisation system is related to politics)
- Proposal
- Titles are easy to read ✔ (reply: sorry, but as a Chinese speaker I don't think they are easier to read than the current separated romanisation format)
- Titles are easy to remember ✔ (reply: from a Chinese speaker's side, they are not easier to remember than the current separated romanisation either)
- Fits the rules of Latin script (Romanisation = writing words from other script to Latin/Roman script) ✔ (reply: Latin script doesn't fit Chinese; from my side the Pinyin system is much better too)
- Fits the rules of Chinese script ✘ (Impossible, if you want to make it "fit" to the Chinese script, you would have to replace each character with one logogram, Latin alphabet doesn't have logograms.)
- Differentiates between different Romanisations and meanings of the same sequences of characters. ✔ (reply: as explained above, for Chinese characters, Pinyin or any other romanisation system doesn't carry any meaning at all; it is just used as a marker for Mandarin)
- Includes tones in Romanised text ✘ (Impossible with characters which we are limited to. You could use "a1", "a2" (redundant) etc., but that would make the text incomprehensible, majority of people wouldn't even know how to pronounce it)
- Doesn't replace characters with others based on no evidence that they are similar to the intended character. ✔ (reply: since pinyin or any other romanisation/Latin letters don't stand for meaning (they are just used as markers, as I said), this feels unnecessary)
- You can use a different Romanisation system for dialects where the current system wouldn't work at all ✔ (reply: dialects don't have an official formal way of writing; even in HK, schools teach Mandarin grammar and write in standard Chinese grammar while using Cantonese only for pronunciation. There wouldn't be songs that use dialects in the song title and artist, so they don't need to be romanised in any case.)
- Isn't related to politics ✘ (Impossible, picking any Romanisation system is picking a side, every Romanisation system is related to politics)
Hollow Wings wrote:
OK, what a mess.
just a warning: my post will be long.
check it in as much detail as you can to learn about the Chinese language and its romanisation, if you wanna get involved in this.
A. Important things about Chinese Romanisation
I. "ISO 7098:2015".
first of all, learn as much as you can about ISO 7098:2015.
ISO 7098:2015 explains the principles of the Romanization of Modern Chinese Putonghua (Mandarin Chinese), the official language of the People's Republic of China as defined in the Directives for the Promotion of Putonghua, promulgated on 1956-02-06 by the State Council of China. This International Standard can be applied in documentation of bibliographies, catalogues, indices, toponymic lists, etc.
all contents in this document are important, you may know some already. and there are two parts i wanna specially mention for you, they are like:
1. In automatic romanizing working progress, there're two ways for Chinese Romanisation:
a. semi-automatic romanisation from Chinese words separated by following proper rules.
b. automatic romanisation from Chinese characters one by one.
2. During this period of time, most of other countries aside of PRC can't fully accept that romanizing Chinese characters into separated words according to combinations between Chinese characters, because the works of finding and dealing with the concept of Chinese words are complex, also the grammar of Chinese sentence can even blur it.
after thousand of thoughts, they decide to do the romanization work from Chinese characters one by one.
↑ this is my opening, just mark it and go on.
II. How special Chinese is as a kind of language.
according to the way characters comprise words, languages can be divided into alphabetic language and ideographic language, with alphabet and ideogram as their own characters.
a. alphabetic language is simple, most of you can easily know its concept. also, most of the languages that exist now are alphabetic languages. each is comprised of a proper alphabet of its own. as far as i know:
- Cyrillic alphabet (eg. Russian)
- Hebrew characters (eg. Hebrew)
- Arabic alphabet (eg. Arabic)
- Armenian character (eg. Armenian)
- Georgian character (eg. Georgian)
- Old Geez abjad (eg. Old Geez) ←already dead
- Devanagari script (eg. Sanskrit)
- Tamil alphabet (eg. Tamil)
- Kana script (eg. Japanese)
- Hangul script (eg. Korean)
- Thai script (eg. Thai)
- Tibetan script (eg. Tibetan)
- Mongolian script (eg. Mongolian)
... and tons of other alphabetic languages which may not be widely used or are just dead.
b. ideographic language is like, every single character was born from some exact thing or matter, this is very different from alphabetic language.
however, as far as i know, the languages that are ideographic are:
- Egyptian hieroglyphs (eg. Ancient Egyptian) ←already dead
- Cuneiform script (eg. Ancient Sumerian) ←already dead
- Seal hieroglyphs (eg. Ancient Indian) ←already dead
- Maya hieroglyphs (eg. Ancient Mayan) ←already dead
- Chinese characters (eg. Chinese)
and NO MORE.
if you want to know why language system is like that, then that's a long story, i wont start telling them here.
the reason i pick up those truth above, is because i want you guys know the chinese language's specificity and leading to how different romanisation is done between alphabetic language and ideographic language.
III. “Transliteration” and “Transcription”
(Wafu: I have to shorten this because of character limit.)
B. Relation to osu community nomination system
↑i don't know if CrystilonZ knows the whole Chinese language family clearly enough, so i'll add some additional things as basic background knowledge here.
CrystilonZ wrote:
Other languages that use the Chinese script are irrelevant to this proposal.
We are only talking about Standard Mandarin here and Mandarin is not equivalent to Chinese.
We only use 'Chinese' in the draft for simplicity. The wording will be changed if this is implemented.
ISO 639 code sets
Documentation for ISO 639 identifier: zho
Identifier: zho
Name: Chinese
Status: Active
Code sets: 639-2/T and 639-3
Equivalents: 639-1: zh
639-2/B: chi
Scope: Macrolanguage
Type: Living
Denotation: See corresponding entry in Ethnologue.
The individual languages within this macrolanguage are
- Gan Chinese [gan] → 赣语
- Hakka Chinese [hak] → 客家话
- Huizhou Chinese [czh] → 徽州话
- Jinyu Chinese [cjy] → 晋语
- Literary Chinese [lzh] → 文言文
- Mandarin Chinese [cmn] → 官话(普通话)
- Min Bei Chinese [mnp] → 闽北话
- Min Dong Chinese [cdo] → 闽东话
- Min Nan Chinese [nan] → 闽南话
- Min Zhong Chinese [czo] → 闽中话
- Pu-Xian Chinese [cpx] → 莆仙话
- Wu Chinese [wuu] → 吴语
- Xiang Chinese [hsn] → 湘语
- Yue Chinese [yue] → 粤语
ok, so, things above are just for electric area. there're still lots of other native language in PRC.
and i just don't post PRC's official native language list here, in case make things more complex.
since people like CrystilonZ may insist that Mandarin Chinese is the main target and that the other Chinese varieties have no business with it, let's start from the concept of "macrolanguage":
it actually has the property of a "shared standard pronunciation and style of writing".
for Chinese as a macrolanguage, that standard is Mandarin Chinese.
so the truth is, all members of the Chinese language family DO have a common standard, with hundreds and thousands of connections to it. whenever you talk about another member of the family, it is always affected by the Mandarin system, which is the exact center of the whole topic.
if you want to leave out every other member of the Chinese family, then you need to give another complete romanisation rule to solve the problems that may happen in the transcription process. otherwise, the Mandarin Chinese rule automatically becomes the official solution for them. because of that, we should care about its effect on the other members of the family.
also, the so-called "Cantonese" is actually the concept of "languages spoken in Guangdong Province", which contains "Min Zhong Chinese", "Hakka Chinese" and "Yue Chinese". people usually use the narrow sense of the word and treat "Cantonese" as "Yue Chinese".
what's more, the native language spoken in Taiwan is a kind of Min Nan Chinese, in case some ignorant one jumps out.
with all that knowledge, we can move on:
I. How to deal with transcribing words that come from other members of the Chinese family but have already become part of Mandarin Chinese?
1. Chinese archaisms
they are part of Literary Chinese, but have also become part of Mandarin Chinese.
some of them have even changed meaning, and they are hard to tell apart.
if Literary Chinese is regarded as an individual language separate from Mandarin Chinese, then how do we deal with words like "空穴来风", "闭门造车", "人尽可夫", etc. when we meet them?
2. songs in more than one Chinese variety
for example, there's a Chinese song called "好心分手", and one of its versions is sung in both Yue Chinese and Mandarin Chinese.
the Yue Chinese romanized version is "Hou Sam Fan Sau/Housam Fansau" (this is actually jyutping, a special kind of pinyin)
and the Mandarin Chinese romanized version is "Hao Xin Fen Shou/Haoxin Fenshou".
both of them are exactly correct as spoken, so how do we deal with these?
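To make the question concrete, here is a minimal Python sketch (an illustration only, not an existing osu! tool). It assumes a hand-made table keyed by the ISO 639-3 codes `yue` and `cmn`; the `ROMANISATIONS` table and the `romanise` helper are hypothetical names. It only shows that the same written title needs a language tag before a romanisation can be picked.

```python
# Toy table: one written title, one romanisation per language variety.
# Keys are ISO 639-3 codes; values are the per-syllable forms quoted above.
ROMANISATIONS = {
    "好心分手": {
        "yue": "Hou Sam Fan Sau",   # Yue Chinese (jyutping-based)
        "cmn": "Hao Xin Fen Shou",  # Mandarin Chinese (pinyin-based)
    },
}


def romanise(title, lang):
    """Return the romanisation of `title` for the variety `lang`."""
    try:
        return ROMANISATIONS[title][lang]
    except KeyError:
        # Unknown title, or a variety with no entry in the toy table.
        raise LookupError(
            f"no romanisation known for {title!r} in {lang!r}"
        ) from None
```

The written title alone is not enough input here: without the `lang` argument the lookup has no way to choose between the two results.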
3. Chinese varieties with no supporting romanisation rules
for example, there's a Chinese song called "外滩18号", which is sung in three kinds of Chinese: Mandarin Chinese, Wu Chinese and "Southwestern Hakka" (an official native Chinese language of PRC).
so it can be romanized like:
Mandarin Chinese: "Wai Tan Shi Ba Hao/Waitan Shibahao"
Wu Chinese: "Nga Thae Tze Ba O/Ngathae Tzebao"
Southwestern Hakka Chinese: "Vai Tan Si Ba Hao/Vaitan Sibahao"
aside from the Mandarin ones, i'm not sure whether these are correct (i just typed them here after searching a dictionary of native romanisation), but each variety could still have a romanisation of its own, right?
then how do we deal with these?
II. Even if we do transcribe Mandarin Chinese into Latin characters as separated words, who is going to help the mappers who map a Chinese song?
it has several parts:
- is this a Mandarin Chinese song?
  - maybe you can tell from official settings or sites, not a big deal. but that will not work if you map some cult song.
- how to get the right romanized characters?
  - ask some Chinese staff/mapper/player? i doubt any of them has the time/ability to do it.
- how to make sure the things i got are correct?
  - about the same as the one above. if such a person exists and can do this job endlessly, he will be really welcomed by this system.
you may think most Chinese words are not that complex, but if you want to build a reasonable system of rules, it should be strict.
and as long as you are not the person who does this kind of work, you can hardly imagine how hard it is.
C. Summary
I. Opinions
1. even international-level groups can't do much word-separated romanisation for Mandarin-Latin transcription.
it's feasible, that's true. but its efficiency is really, really low.
Chinese staff will be worn out to death if they really do this, because, as i've explained, it's tough work with a tough process.
i can even predict that someone will search for the correct Mandarin romanisation for months and still get dqed after the answer he found turns out to be wrong. that may stop people from mapping Chinese songs, and personally i think that's really bad news.
2. Mandarin Chinese and Cantonese have standard romanisation rules, but the other members of the Chinese family do not. it's hard to complete one if you don't care about all of them, because every single one of them shares a common standard pronunciation and style of writing: Mandarin Chinese.
because of that, rebuilding the Mandarin Chinese romanisation system into a better and complete one would be really hard work, and it's for sure out of the osu community's range.
3. the Chinese osu community already argued about this several times long ago, and the result was always: keep the current state.
II. Conclusion
romanising Mandarin Chinese character by character is the best way SO FAR,
until some genius invents a Mandarin-Chinese-characters-to-Latin-characters romanisation dictionary and raises the efficiency far above the current level.
also, this is exactly what international groups do right now. (they only combine proper nouns like the names of people or places, etc.)
--------------
simple extra p.s. here:
to CrystilonZ, and other people who know little about Chinese:
i think you have some wrong ideas about Chinese characters, because i've seen this written:
CrystilonZ wrote:
Similar to Japanese, one Chinese character does represent one single syllable. However, a word is not necessarily comprised of one syllable (like Japanese, Chinese is a polysyllabic language). For example 图书馆 (túshūguǎn) as a whole means library, and writing 'li bra ry' would defeat the purpose of Romanisation by not resembling the structure of languages using the Roman alphabet.
Chinese is far different from Japanese. the syllable thing you are talking about may hold for Japanese Hiragana or Katakana, but it is not that true for the Kanji part.
(btw, you may already know that a part of the Japanese language system is just exactly Chinese.)
and now, after reading everything i wrote above, you may know that Chinese is not only a polysyllabic language, but also the only living ideographic language.
"图书馆" reads "tú shū guǎn" and means "library", true.
However, "图书" reads "tú shū" and means "library book" or just "(picture) book", did you know that?
this is far different from English, where in most cases you can't split a word: you CAN split a Chinese word, because every single Chinese character can be a word.
eg.
图→graph, graphic, or lots of other meanings;
书→book, writing, letter, or lots of other meanings;
馆→shop, embassy, gallery, or any building that exhibits something.
so, the one-character-one-word method is a solid and reasonable method for Chinese romanisation.
knowing this, i hope you can rethink your idea of Chinese, which will help you understand the previous romanisation part.
--------------
hope all of these things could help you know more about Chinese romanisation.
also if you have any confusion about anything above, you are always welcomed to ask.
Regraz wrote:
Regarding the Romanisation of Mandarin, I would like to post my comments here.
Firstly I would like to start with the following proposal:
Proposed Rules wrote:
The ü vowel should be Romanised into u and all diacritical tone marks should be omitted because of the technical limitations resulting from the limited amount of characters allowed in the Romanised title/artist fields.
CrystilonZ wrote:
speaking about u and v here. v is just impossible to pronounce. I'm always open for a better alternative.
Please understand first: if you want to change the current rule, namely the romanisation of ü from v to u, you have to prove FIRST that u is a better choice than v, instead of announcing that you are going to change it to “u” while asking us to provide a better choice. There are plenty of letters and characters that could be chosen, so why did you choose u? Just because they look similar after omitting what you call the “diacritical tone mark”? I don’t think that is a reliable reason for this change, as judging only by visual appearance is pretty unprofessional when talking about romanization. Additionally, Fycho has already mentioned the potential mess that might result from changing v to u, indicating that this entry of the proposal does not have only pros. Therefore, you should not simply say “The ü vowel should be Romanised into u…” and justify the change only by the technical difficulties that prevent “ü” from being used in the current system. You should explain with valid reasons why “u” is better than “v”, as well as how you are going to address the potential problems if this “u” proposal is implemented. (That “u” can be pronounced is not a valid reason: there are many characters that can be pronounced, like a, e, i, o, and some two-letter combinations like yu, which Fycho mentioned. All of them have pros and cons, so why did you give preference to u in this draft?)
Again, if you would like to change the current criteria, try to form solid reasons and show people why your proposal is better than the current one. Saying “I am going to change this into that, and if you don’t have better choices then this will be the new criteria” sounds pretty irrelevant and illogical, and shows a kind of manipulation of the criteria about Romanization of Mandarin.
I would like to proceed to the comparison between the current and the proposed system in the previous discussion:
Previous Discussion wrote:
- Current system
  - Titles are easy to read ✘ (most people will read every syllable as if it was one word)
  - Titles are easy to remember ✘ (words are easier to remember than separate syllables, humans remember the words easier by their shape)
- Proposal
  - Titles are easy to read ✔
  - Titles are easy to remember ✔
I don’t think that titles are easier to read and remember with the proposal.
How do you expect speakers who don’t know how to pronounce “ü”, “v” and “u” to differentiate syllables and words under Romanisation of Mandarin?
For non-Mandarin speakers, there is no difference in readability between “Wo De Wei Lai Shi” and “Wo de Weilaishi” or any other combination like “Wode Weilai Shi”. They have no idea what is a syllable and what is a word. If you think words are easier to remember (you did not post any proof or research regarding this either), why can’t a player treat the syllables as words, given that the player has no idea whether what they are reading is a word or a syllable? There are fewer syllables than words in total, so they should be even easier to read and memorize!
Fycho wrote:
The main arguments are listed below:
- Whether we romanise Chinese titles character by character (each character romanised into a single, capitalised, separated word), or separate and capitalise every word according to The Basic Rules of the Chinese Phonetic Alphabet Orthography.
- Whether to use "yu" or "u" for the romanisation of the vowel "ü".
- Whether we need to distinguish dialects from Mandarin in romanisation.
For the first point, I recommend everybody read ISO7098:2015 before sharing opinions. The romanisation of Chinese is much more complex than that of other languages and needs a lot of professional knowledge about Chinese. The new proposal can't handle “a word or phrase with two or more meanings”. For example, "他谁都打不过" is used intentionally to carry two meanings, "Nobody can beat him" and "He can't beat anybody": "Ta / Shui / Dou Da Bu Guo" and "Ta / Shui Dou Da Bu Guo". And it wouldn't be easier to remember or read for Chinese or non-Chinese speakers. I am not going into detail, as someone else would like to give a more professional explanation.
For the second point, "v" is currently used in a lot of places. "ü" is a single-letter vowel, and it works differently in pronunciation from two-letter vowels like "iu", "an", "ie", "üe", "ai", "ao", etc. We use "YU" for "ü" only in passports and other specific cases, because the passport requires the name in capital letters and "ü" doesn't have a capital form. Elsewhere, "v" is still used. For the single-letter vowel, "v" is the most common and familiar letter, it is officially supported, and it is what the majority of keyboard input methods use. I believe using "yu" for "ü" only makes it easier to read than "v" for non-Chinese speakers, but it's technically wrong, and there aren't any other benefits. The "u" of the syllable "yu" is actually and technically the vowel "ü", and after "j / q / x / y" we write "u" for "ü", but that doesn't mean "u" can completely stand for "ü", nor that "yu" can stand for "ü". "y" isn't a vowel in the Pinyin system at all; "y" is a consonant that has the same pronunciation as the vowel "i", while "iu" and "yu" are two completely different things. In the pinyin system, a vowel can't be made from a consonant. That means "yu" can by no means become a two-letter vowel, and using "yu" for romanisation would totally disobey the language system. "v" works best at the moment.
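The options for "ü" discussed in this thread ("v", "u", "yu") can be compared with a small Python sketch. This is only an illustration under assumptions: `flatten_pinyin` and its `umlaut_as` parameter are hypothetical names, and the approach (Unicode NFD decomposition from the standard `unicodedata` module) is one possible way to drop tone marks, not how osu! does it.

```python
import unicodedata

DIAERESIS = "\u0308"  # combining mark that turns "u" into "ü"


def flatten_pinyin(text, umlaut_as="v"):
    """Drop tone marks and replace ü with `umlaut_as` ("v", "u" or "yu")."""
    out = []
    # NFD splits e.g. "ǜ" into "u" + diaeresis + grave accent.
    for ch in unicodedata.normalize("NFD", text):
        if ch == DIAERESIS and out and out[-1] in ("u", "U"):
            # Swap the base letter for the chosen ASCII replacement.
            out[-1] = umlaut_as.upper() if out[-1] == "U" else umlaut_as
        elif unicodedata.combining(ch):
            continue  # any other combining mark is a tone mark: omit it
        else:
            out.append(ch)
    return "".join(out)
```

With `umlaut_as="u"`, the syllables "lǜ" and "lù" both come out as "lu", while `umlaut_as="v"` keeps them apart as "lv" and "lu". That kind of collision may be part of the "potential mess" mentioned earlier in the thread.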
For the third point: is it necessary to distinguish dialects from Mandarin in romanisation? As all of us know, dialects differ in pronunciation, and some have different grammar. However, none of the dialects has an official written format, and all of them are related to Mandarin. A lot of words written in Chinese characters are shared by all the dialects. Take "好心分手": you can't know whether it's Mandarin, Wu Chinese or Cantonese unless someone pronounces it. Officially and technically we can't tell which one it is; it's just modern standard Chinese, and we romanise it in the standard way. Personally, I am a dialect speaker, and I can speak Wu Chinese and Mandarin well. The major issue is that there isn't any official way to write a dialect. This is unlike the Japanese dialects: Japanese Hiragana and Katakana are phonograms, the same as Latin script, whereas Chinese characters are ideographic. That means Chinese characters can't be used to represent the pronunciation of dialects, so there won't be any official written form of the dialects, and there won't be any song title written in a dialect. There isn't any officially published way to romanise the pronunciation of dialects either. Therefore it's unnecessary to distinguish dialects from Mandarin. By the way, if you are going to bring up Cantonese (Yue Chinese), there isn't any official written form for Cantonese either. In HK and Macau, schools teach the standard Chinese written form; people personally like to type Yue characters in Cantonese, which is more of a cultural thing and is not officially taught by schools. Enforcing something unofficial just makes us end up with endless discussions. That's why there isn't any official romanisation method to this day: people have already argued a lot about it in the real world and haven't reached a conclusion.
How can we romanise an independent language that doesn't even have a written format? I believe this is beyond the osu! community, and it's unnecessary to figure it out at the moment.
I've asked some Chinese-speaking QAT/GMT (Nardoxyribonucleic, spboxer3 and Zero__wind) for opinions about the proposal, and all of them think it's not necessary to revise the current romanisation rules for Chinese.
Hollow Wings wrote:
(maybe i'm not attentive enough... )
------------------------
CrystilonZ wrote:
For point 1. I just don't see how this is related to our discussion. " In automatic romanizing working progress"
1. if the osu community doesn't use an automatic or semi-automatic working process, it'll be manual. and i just told you everything about why that process is complex.
2. other alphabetic languages can be romanized automatically. so that's what the osu community is doing.
→
are you going to let that complex work be done manually by Chinese osu staff? (you are not Chinese, so you won't be the one doing it anyway.)
i prefer to just drop that and keep what we have: automatic romanisation, character by character.
------------------------
CrystilonZ wrote:
2. Can you quote the exact words from the document? also all the reasons as stated in the standard as well. I couldn't read it while working on the proposal because 115 swiss franc is hella expensive.
omg... you don't even have any channel or way to read that document? then you may not know lots of the concepts it mentions.
and if you haven't read it, then you didn't even meet the precondition of my previous post, which is bad news to me.
i still recommend you try hard to find a way to read that document.
since you are so strict about that, i'll paste some parts of the document. (but since it is copyrighted, i'll paste only text here, not the original pictures.)
ISO7098:2015 said:
12 Automatic transcription for named entities
In the computer-assisted documentation, there are two approaches to automatic transcription for named entities, namely:
- fully automatic syllable transcription;
- rule-based and semi-automatic word transcription.
since you didn't read that document, i just want to say that:
the main part of the document is just discussion about how to transcribe proper nouns (or just "names") of places and persons.
that's what i summed up about ISO7098:2015 in my previous post.
and i emphasize this again: at the international level, most Chinese words are still transcribed character by character into the Latin characters of pinyin.
ISO7098:2015 just made a small step: combining proper names.
the romanisation of Chinese in ISO is far from complete.
i don't think the osu community can do what ISO wasn't able to do.
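The difference between the two approaches named in the ISO quote can be sketched in Python. This is only a rough illustration under assumptions: the `PINYIN` table is a three-character toy with a single reading per character, and the function names are hypothetical, not taken from ISO7098:2015 or from any osu! tool.

```python
# Toy lookup table: one toneless pinyin reading per character.
PINYIN = {"图": "tu", "书": "shu", "馆": "guan"}


def syllable_transcription(text):
    """Fully automatic: every character becomes its own capitalised word."""
    return " ".join(PINYIN[ch].capitalize() for ch in text)


def word_transcription(words):
    """Semi-automatic: `words` is a segmentation that someone has to supply."""
    return " ".join(
        "".join(PINYIN[ch] for ch in word).capitalize() for word in words
    )
```

`syllable_transcription("图书馆")` needs nothing beyond the table and gives "Tu Shu Guan". `word_transcription` gives "Tushuguan" for `["图书馆"]` but "Tushu Guan" for `["图书", "馆"]`, so its output depends on a segmentation decision that the table cannot make by itself.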
------------------------
CrystilonZ wrote:
Read more about ideograms here. These are logograms. Modern Chinese characters are logographic.
A number of lines after this are about pinyin being a method of transcription. No comments there this is acknowledged since the beginning that this is just the way to pronounce stuff. And the next few lines are about Mandarin having a lot of homophones.
seems like you are really obsessed with concepts, maybe it's my bad for simplifying those things.
then let me explain clearly: being an ideogram, as i called the Chinese character, is one of its properties, like the other ancient scripts.
the so-called "logogram" is not the exact definition of the Modern Chinese character. let's see what ISO7098:2015 shows:
ISO7098:2015 said:
2.6
ideophonographical character
graphic character (2.6) that represents an object or a concept and is associated with a sound element in a natural language.
EXAMPLE Chinese hanzi 鹤(crane), Japanese kanji 戦(war) and Korean hanja 册(book) are ideophonographical characters.
just to mention: hanzi (Chinese), kanji (Japanese), and hanja (Korean) read similarly, right? they all come from a common source: Chinese characters (汉字). and that is the exact "Chinese character" i pointed out in my previous post as an ideogram.
and some additional knowledge here: you may know that, at the VERY FIRST, alphabetic characters were ideographic characters as well. later people just got rid of their meanings and used those characters as a tool to build words, which didn't happen to Chinese.
(like, when you see the character "m" you may see nothing, or you may see everything; that's not how Chinese characters work.)
now we are clear enough to settle on the concepts: the Modern Chinese character is a kind of ideophonographic character.
(and you may also know that both the ancient Chinese character and the ancient alphabetic character were ideographic characters.)
------------------------
CrystilonZ wrote:
This is not exactly true. If it were Mandarin would have been dead a long while ago because the only way to communicate would be carrying a crap ton of paper with you at all time and write stuff when you want to communicate.
In English context it would be equivalent to you guys seeing or hearing /tīm/ (IPA stuff. This reads time). Intuitively the first thing that come into your heads would be the time. Tick-tock clocky stuff. However under different contexts:
"Can you buy me some /tīm/. I'm going to use it to cook dinner." In this case /tīm/ is the herb thyme.
"I don't have enough /tīm/ to do my homework. It's due tomorrow." In this case it's "time"
"Two /tīm/ two equal four." In this context it means multiply. 10/10 grammar.
As you can see they are reversible with context. And when you guys speak to each other you're actively tracing back to the original Hanzi characters using their pronunciation. Therefore, saying that it is not reversible is not true. It's harder in Mandarin (410 syllables - crap tons of words. Do the maths) but the fact that there are people speaking Mandarin proves the fact that it's possible.
lol
NONSENSE.
i think you still don't have enough understanding of what Chinese words and sentences can turn into.
again, you CAN'T simply know which Chinese characters are meant until you fully understand all the components both inside and outside the sentence.
if you just get the sentence without any other context, you will never be able to do that, which means the sentence has various meanings.
here are some examples:
a. one of the best examples, which shows that you may get into big trouble if you make a mistake with it.
this is what the general phenomenon in the Chinese language environment and its romanisation looks like.
"Gu Niang, Shui Jiao Yi Wan Duo Shao Qian?"
this sentence mainly has two meanings:
1. "Hey girl, how much does a bowl of your dumplings cost?" (姑娘,水饺一碗多少钱?)
2. "Hey girl, how much does it cost to sleep with you for one night?" (姑娘,睡觉一晚多少钱?)
this happens widely in electronic alphabet systems without tones, just like the osu system.
b. a more common one here.
"Jie Dao Shou Zhang Zai He Shang"
this is too complex, so i'll just transcribe some parts, and you can do the combinatorics yourself and see if you can figure out everything that sentence may mean:
1. "Jie Dao" → 接到(catch/catch up/get/take/etc.), 街道(street/road/way/etc.), etc.
2. "Dao Shou" → 到手(already get sth./reach your hands/etc.), 倒手(transfer things between hands/buy in&out/left hand/etc.), etc.
3. "Shou Zhang" → 手掌(palm/people you trust/etc.), 首长(boss/highest level person/etc.), 收账(charge/blackmail/etc.), etc.
...
oh hell, i won't continue.
this also happens even when the words are separated:
"Jiedao Shouzhang Zai Heshang"
↑ try your best to figure out what this means; i predict you will find at least 4 meanings.
that's why i keep saying it's complex:
alphabetic characters can be transliterated immediately, even if you don't know what the word means.
and this won't work for ideophonographic characters, especially Chinese characters.
and that's also the detailed reason why it's not reversible.
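The non-reversibility described above can be sketched in Python with the words from examples a and b. This is a toy illustration only: the `READINGS` table holds just those few words with their toneless pinyin, and `build_reverse_index` and `candidates` are hypothetical helper names.

```python
from collections import defaultdict

# Toneless pinyin for the words used in the examples above.
READINGS = {
    "水饺": "shui jiao", "睡觉": "shui jiao",
    "一碗": "yi wan", "一晚": "yi wan",
    "接到": "jie dao", "街道": "jie dao",
    "到手": "dao shou", "倒手": "dao shou",
    "手掌": "shou zhang", "首长": "shou zhang", "收账": "shou zhang",
}


def build_reverse_index(readings):
    """Map each toneless pinyin string to every word that is written that way."""
    index = defaultdict(list)
    for hanzi, pinyin in readings.items():
        index[pinyin].append(hanzi)
    return dict(index)


REVERSE = build_reverse_index(READINGS)


def candidates(pinyin):
    """Return all known words for a romanised string (case-insensitive)."""
    return REVERSE.get(pinyin.lower(), [])
```

Going from characters to Latin letters is a plain lookup, but `candidates("Shou Zhang")` already returns three different words in this tiny table, so the way back needs context that the romanised string does not carry.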
c. a special meme here.
"Shi Shi Shi Shi Shi"
non-Chinese speakers may have no idea what this is.
but it's a popular article called "施氏食狮史", which is the best example of how hard it is to read Chinese with only pinyin (or romanized Latin characters).
if you insist on your opinion, then try to figure out what this sentence means:
"Ji Ji Ji Ji Ji"
just to mention: that's also a wonderful article in Chinese writing.
this is just one form of Chinese meme; there are tons of others in Modern Chinese,
like "爷爷", "不星", etc.
Latin-to-Chinese transcription is not reversible; that is the exact truth.
------------------------
if you think word-separated pinyin in Latin characters is the better romanisation of Chinese characters,
then you are wrong.
as a pure system, it's better ofc,
because it helps non-Chinese people read and understand.
and actually that is the ultimate goal Chinese romanisation wants to reach.
but in every other respect, it sucks.
1. it can't be done automatically, so it needs manual work, which is tough and complex. you are not the one doing it, so you won't understand.
2. you still need deep knowledge of "what is a Chinese word" before you can search for one. that makes it worse, because it's harder.
3. osu staff are not language specialists. they are the best at mapping or map checking, but not in the language area.
4. etc. (too many, so i'll just stop here.)
------------------------
the standard of Chinese romanisation is not even built up yet. "The Basic Rules of the Chinese Phonetic Alphabet Orthography" is just a tool that shows there are rules for doing it; it doesn't mean we can really do it.
it's also why transcription into separated romanized Chinese words is called "semi-automatic": part of it is still manual, and will stay manual for a long time,
unless our AI tech is upgraded to such a high level that it can analyse complex Chinese sentences and do the rest the way transliteration of alphabetic languages is already done. (or maybe you know some Chinese language specialists and can pay them to do this work.)
though i've told you the truth about how complex separating Chinese words is, here's another part of the ISO document:
ISO7098:2015 said:
10.7 At present, in Chinese linguistics, there is no clear common definition of a Chinese word yet, so it is difficult to decide the boundary (dividing line) of a common Chinese word sometime, and, of course, it poses difficulty to link the monosyllables to form a common polysyllabic Chinese word.
sure, the metadata of osu maps is important, but this topic is far beyond what the osu community can do.
waiting for the next step then.
Wafu wrote:
…and without Regraz's attempts to make fun of someone
Maybe Wafu did a provocative (but bad?) try to labelize, defame, calumniate and libel others? However, from his PM to me, it seems Wafu himself even failed to keep his words civilized. I will attach the screenshot of that forum pm here for everyone to read.
Wafu wrote:
Chinese is a logographic language. It's not pictographic nor ideographic language. They use a some characters with pictographic/ideographic features, but that doesn't make Chinese a pictographic/ideographic language. You guys don't even know your own language. Wake the fuck up.
Sadly, arrogantly defining Chinese as a logographic language is a pure fallacy. In formal Chinese writing today, Chinese characters are input by keyboards in a syllabic way, and it is the dominant method of inputting Chinese (and its Romanization as well) in osu! amongst players. Under the mixed impact of other languages and the development of currently dominant input method of Chinese in Internet, especially when it comes to romanization, "you called" logographic characteristics of Chinese is increasingly ambiguous. Your statement is already groundless and archaic because you still stay with writing Chinese characters in paper, instead of considering the input method with keyboards, which is syllable-oriented and in fact supports the current syllable-based metadata (Romanization) scheme.
osu! Rules wrote:
Be productive with your criticism without resorting to personal attacks. Criticism is a wonderful thing when done properly, but if you're resorting to personal attacks to make your point, you're doing it wrong and you should feel bad.
I hold respect toward the entire UBKRC team as they comprise the most experienced people about criteria elaboration and modification but this just makes me quite disappointed. It is really a lackluster, and even a blemish.
Regraz wrote:
First, In formal Chinese writing, there is no logograms as well.
I believe what Regraz is trying to say here is about inputting Chinese characters using Latin keyboards. To type Hanzi characters you simply write the pinyin for them and because of there are a lot of characters with the same pronunciation, usually there will be a pop-up list like this for you to choose the characters from
Shad0w1and wrote:
So let's face the reality, there isn't a standard for Chinese romanization into ANSI code. I can't understand that without a commonly accepted standard, why would you guys try to change the current metadata rule?
We've expressed (thoroughly I believe) what problems the current system has. Please read all the previous points made in this discussion.
CrystilonZ wrote:
The thing is this is just a way to input Hanzi characters into an electronic system. In the example above wo men de ai is never supposed to be the final result. The way you input stuff is not at all related to how you Romanise things.
This paragraph is quite illogical.
CrystilonZ wrote:
I'd like to explain about Chinese characters being logographic as well.
You guys seem to have a little confusion about how characters are formed and how they function nowadays.
Some characters like 月 originally looked like the crescent moon. These are said to be characters with pictographic origin.
Some characters like 上 are created by trying to convey a concept, which in this case is up on above w/e. These are said to be characters with ideographic origin.
There are more ways that Hanzi characters are created but I'll not go there since they are not really related to the topic atm.
However, these are how characters are created not how they function. It's really really important to keep this in mind. As I can see this is where the misconception stems from.
Right now these characters function are to represent words or phrases. Therefore Mandarin is, by definition, a logographic language.
This paragraph is even more illogical than the one above. I think it is you who has quite a few confusions about Chinese/Mandarin and the characters.
ISO wrote:
ISO 5776:2016 specifies symbols for use in copy preparation and proof correction in alphabetic languages and in logographic languages. It is applicable to texts submitted for correction, whatever their nature or presentation (manuscripts, typescripts, printer's proofs, etc.), and for marking up copy for all methods of composition.
See? This standard is specific to copy preparation and proof correction. The standard has to classify languages into types for the sake of copy preparation and proof correction, since in this standard the copy preparation and proof correction of alphabetic and logographic languages differ. Moreover, I believe copy preparation and proof correction digress quite far from the topic here.
CrystilonZ wrote:
It's really really important to keep this in mind.
Actually, more pertinent standards have already been provided by Hollow Wings above. However, you paid no attention to them when posting here and brought up this standard instead. This is really not good manners, and it failed to support your statement.
CrystilonZ wrote:
From now on I'd like to request everyone involving in this discussion to refrain from using condescending tone, sarcasm, personal insults and anything that can impede the process. Don't take every fucking thing personally. Read these things with three pieces of chocolate chip cookie and a cup of tea.
It is really interesting to read this: “I'd like to request everyone involving in this discussion to refrain from using condescending tone, sarcasm, personal insults and anything that can impede the process.”
CrystilonZ wrote:
Please read all the previous points made in this discussion.
This is exactly what I want to say to you, though actually I would like to say: Please read all the previous points made in this discussion carefully. The rest of your misconceptions have already been explained by Hollow Wings and Fycho, so if you choose to ignore them and force your idea, then I have nothing to do with that. People do not want to explain over and over again, as that is pretty time-wasting.
i'm not defending anything, i'm citing an international standard to show concepts that have been officially confirmed, not everyday usage of the words or just some instant wikipedia knowledge.
Wafu wrote:
So, first of all, I'm quite surprised you are even trying to prove how Chinese is not logogram language. One says pictogram, one says ideogram, one says ideophonograph. I feel like you're trying to defend this so much that you have to take every single thing we said (even out of context, by the way) and simply make up something and say "do your research", while completely ignoring what we said.
↑ and this is the most hilarious thing i've seen today.
Wafu wrote:
You, as Chinese have no priority in this matter, just because it's about Chinese,
nonsense.
Wafu wrote:
so stop acting like you are the better one and we know nothing because we didn't read document X, which nobody's even provided us before (and one of you in particular accusing others of not reading it, while misunderstanding it). Today's Chinese language is using only logograms. Origin of many of these logograms is pictographic or ideographic. You have to realise that something having pictographic/... features doesn't mean it's not logogram. I think you should know it, if you want to use it as an argument. What Hollow Wings said about this, by the way, is exactly proving that CrystilonZ was completely right on the fact that Chinese is logogram language, it's just that HW didn't understand the concept mentioned in the ISO file.
nonsense.
Wafu wrote:
- 1. You have to remember that ISO being international doesn't mean that we have to follow ISO (After all we would be breaking many of their standards, even in the other Romanisation systems). There are many references and standards that are not ISO and are better in quality and design. It's not like taking one thing that benefits you is going to help. Regardless, as I've read this standard (and if you are concerned that I may not have done some research, this is a tiny part of what I've read to make my opinions on this issue, the research is much deeper than a single ISO document), I think it benefits you less than you think it does. In fact, from what arguments you are quoting, you don't seem to understand it very well on your own (or maybe you just expressed yourself poorly, but you misrepresent the document you are promoting here).
nonsense, and ignorant, again.
Wafu wrote:
- 2. I already addressed this point in the previous post, but it was kinda taken out of context. I already explained this in relation to Chinese script incompatibility. Anyway, what you're trying to say here is quite nonsensical. None of the listed languages are actually ideographic. All of these are logogram languages that partially use ideographic characters, but mostly characters that just originate from ideographic characters. The languages themselves use logograms.
the reason i showed you it's not reversible is mainly to tell you that Chinese characters are a really special case in romanisation.
Wafu wrote:
- 3. This has been addressed already. There's no reason why the missing reversibility factor would impair Latin script users from reading or memorizing this (that, however, is impaired by the current system which is reversible). It only has impact on people who actually can use Chinese language, these people do have a solution. The original title/artist is still present, so they don't need to reverse it to logograms. For Latin script users, there's no reason why they would want to reverse the text.
This is the part that many of you have taken out of the context, and therefore misunderstood it. Again, Romanisation (internationally), is not designed for people who are fluent in Chinese, so there is no reason why they would want to convert it back to logograms. For those who are fluent in Chinese, you literally have it in the original title/artist.
"Transliteration won't do, we need to do "Transcription"" argument doesn't make much sense. Not only, as I explained above, it's really not needed in this scenario, but at the same time, you are saying this and want to support a system that omits the phonetics? That literally makes it transliteration.
"we chinese ourselves even cant understand what those words said in a short time, if they are all written in Latin characters of pinyin one by one" Don't make this up. If you can't understand Romanised text, it's because you can't process Latin script, again, there's the original title/artist for you to clarify it. You are not the primary target of the Romanisation. It's more important for a regular player to be able to memorize and read the title/artist, than for Chinese player to memorize, read and understand both the original and Romanised titles/artists.
1. how is that out of topic when the standard for all Chinese language families is Mandarin Chinese, which is what CrystilonZ really wants to focus on building the system for?
Wafu wrote:
B
- 1. Not worth talking about. This is a bit more out of topic and should already be clear from the previous discussion.
- 2. The same way we're doing it for Japanese. And we are doing it for Japanese. The "Metadata Heap" Discord server is quite a big one and people do solve issues here quite quickly and effectively. Even for the languages they don't know very well. You really underestimate this community if you think it will be only and only up to staff members. Sure, they have to recheck, but if this is discussed in the channel, they generally have a good starting point. So far, I haven't seen a problem that wasn't resolved there, it really shouldn't be that big of a drama that you make it look like (And yes, even QATs/GMTs are active here, but they don't do majority of the requests on their own). This is, therefore, not an issue.
1.1 i think you just don't get it at all, for you don't understand that no standard has been identified. this has been addressed lots of times.
Wafu wrote:
C
- 1. 1. Same as for the last point in B. This is not at all an issue. As for the second part of it, I don't think people take metadata DQs so negatively. I don't remember a single time that people stopped mapping songs of a certain language due to complicated metadata, even since metadata became more strict.
- 1. 2. Already addressed, this is not what should be discussed. Current system doesn't solve this even though you may imply it does. It doesn't.
- 1. 3. The result is "keep the current state" because of the conservative stance. From what I've seen, Chinese people argued for this system poorly and detached from the community that it's primarily about. Now the target community argues for something else, that doesn't mean you just keep conservative stance because we are not Chinese. The argument can never be that "it was discussed by Chinese community" because it's not only about you. We also don't say: "You can't judge most of the things because you don't know how Latin script languages work.", so don't do that to us.
- 2. Already explained in above paragraphs.
again, nonsense and ignorance, that you compare Chinese to Japanese. i won't write more about things already addressed here.
Wafu wrote:
- 1. Again, nothing is going to be "up to Chinese staff", they don't even communicate about metadata much. It doesn't need to be automatic. Japanese is also not automatic. Your assumption that people who are not Chinese can't do Chinese metadata is incorrect. There are people who do it (e.g. in the aforementioned Discord server for metadata), some do it more reliably than Chinese people. There are many people who look up Japanese characters one by one and they are very accurate with it, even though you have to think of exceptions (there are even Chinese alternatives that are more accurate, btw.). This is not a problem again, unless you assume that where you are born determines what metadata you can do, which is not true.
because i think we should talk about things at the international level, so an international-level standard document is the basic precondition of our topic and discussion.
Wafu wrote:
- 2. I don't understand why you would enforce some "preconditions" in your post. I understand you want people to do some research, which is okay, but remember it works vice-versa.
how ignorant that you don't even understand what Ancient Chinese is.
Wafu wrote:
- 3. Ancient Chinese was not an ideogram language. You've also proven CrystilonZ right, you just used different terms.
nonsense with ignorance.
Wafu wrote:
To your point about us being "wrong". You don't know what Romanisation is about. The idea that Chinese Romanisation's goal is not at all to help non-Chinese people is non-sense. We are not in China where you don't care. We are in osu!, which is an international game. Concept of Romanisation in China is different than Romanisation worldwide (target = people using Latin script, this is what we need to use, otherwise we wouldn't have Romanisation at all). Consider everyone equal rather than saying that someone doesn't understand how hard it is because they are not Chinese. 1. Yes, CrystilonZ could Romanise Chinese. 2. That doesn't say much, but no, you don't need to have extreme experience in Chinese to find correct artist name and song title. 3. I don't know where your information about staff's real life and education stems from. osu! is not a full-time job, so these people can very well be experienced in languages. And no, we don't need to pay language specialists, we never did pay anyone and even complicated metadata is being actively produced.
the biggest issue here now is:
Wafu wrote:
Conclusion:
I think I addressed all the relevant points. I want to make it clear that if some parts sound harsh, that's not what I intended. I just want this discussion to be fair for everyone. (and without Regraz's attempts to make fun of someone)
I hope this makes things clearer. The biggest issue I see you guys didn't understand is how Romanisation works internationally, outside of China. That's probably what keeps you all from realising that being able to convert Romanisation back to Chinese is not the primary intention (especially not in osu!, where the original Chinese titles will be visible)
Your false accusations (of us not reading stuff or not being professional) did, indeed, make me send you this message (and it is called exactly that: "Private message"). I'm not making fun of you, as it was not public; you making it public doesn't mean I'm making fun of you. I wanted you to know that putting this down to "there's no research" was unfair of you, as you didn't invest your time into the research either. Was I being rude to you in the private message? Yes, as you were when you clearly and intentionally ridiculed the proposal, except I at least could keep it private.
Regraz wrote:
Regardless of the totally illogical post Wafu made above, I find it interesting and ironic that Wafu’s post here in fact contradicts what Wafu sent me in a forum private message.
I even could not help laughing when Wafu said:
Wafu wrote:
…and without Regraz's attempts to make fun of someone
Maybe Wafu made a provocative (but bad?) attempt to label, defame, calumniate and libel others? However, from his PM to me, it seems Wafu himself even failed to keep his words civilized. I will attach the screenshot of that forum pm here for everyone to read.
Who do you think is actually making fun of others?
Why am I arrogant for saying that Chinese is a logographic language? What you are saying is a fallacy, because you are changing the topic to the relation of language to keyboards. Yes, logograms can be typed as syllables on a keyboard. That, however, doesn't change the class of the characters. They are typed that way because you can't fit all the characters on one keyboard. A language is logographic if it primarily uses logograms. Hanzi, by definition, are logograms, and that is what makes Chinese a logographic language. Even though you write Chinese differently on a keyboard, that changes neither the definition of a logogram nor the fact that the resulting characters are logograms. If you can't find an example of a Chinese character that is not a logogram, I don't think that several standards, including ISO, would get it wrong.
Regraz wrote:
Sadly, arrogantly defining Chinese as a logographic language is a pure fallacy. In formal Chinese writing today, Chinese characters are input by keyboard in a syllabic way, and this is the dominant method of inputting Chinese (and its Romanization as well) in osu! amongst players. Under the mixed impact of other languages and the development of the currently dominant input method for Chinese on the Internet, especially when it comes to romanization, the so-called logographic characteristics of Chinese are increasingly ambiguous. Your statement is groundless and archaic because you are still thinking of writing Chinese characters on paper instead of considering keyboard input, which is syllable-oriented and in fact supports the current syllable-based metadata (Romanization) scheme.
I said what it is about and what the intention is (even in the proposal posts). You just ignored the reasoning completely.
Regraz wrote:
Learn the basics about metadata and Romanization. I do recommend knowing the basics about Mandarin as well, since Romanization is work that requires knowledge of both ends, though I do not expect you to do this much because it seems you have no idea what Romanization is.
Wafu wrote:
You, as Chinese have no priority in this matter, just because it's about Chinese,
draft wrote:
Cyrillic Romanisation: Use BGN/PCGN system for Russian/Cyrillic. Е and е should be romanised as ye if it stands alone or after a, e, ё, и, о, у, ы, э, ю, я, й, ъ, ь. In other cases, it should be romanised as e. ё should be romanised to ye, however, use yo or o to avoid usage of special characters. Ignore any other rules in the file provided, these are either irrelevant or wouldn't help in the game. If an artist uses a preferred romanisation, follow it regardless of this rule. For most of the other characters, refer to the first page of this document.
draft wrote:
Songs with Russian metadata must be romanised using the Cyrillic Romanisation method in romanised fields when there is no romanisation or translation information listed by a reputable source. The same applies to the Source field if a romanised Source is preferred by the mapper.
Russian Romanisation: Use BGN/PCGN system. Е and е should be romanised as ye if it stands alone or after a, e, ё, и, о, у, ы, э, ю, я, й, ъ, ь. In other cases, it should be romanised as e. ё should be romanised to ye, however, use yo or o to avoid usage of special characters. Ignore any other rules in the file provided, these are either irrelevant or wouldn't help in the game. If an artist uses a preferred romanisation, follow it regardless of this rule. For most of the other characters, refer to the first page of this document.
Songs with Russian metadata must be romanised using the Russian Romanisation method in romanised fields when there is no romanisation or translation information listed by a reputable source. The same applies to the Source field if a romanised Source is preferred by the mapper.
As for other Cyrillic-based languages, I propose handling them case by case, because the current number of such sets is negligible.
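To make the Е/е rule in the draft above concrete, here is a minimal sketch in Python. It is only an illustration, not an official tool: the function name `romanise_russian` is made up, "stands alone" is read as "at the start of a word", ё is always written as yo (the draft allows yo or o without saying when), and the hard and soft signs are written as plain ASCII quote marks. All of these are assumptions; a preferred romanisation from the artist would still override the output.

```python
# Base BGN/PCGN letters for Russian (lowercase). е and ё are handled separately.
BASE = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ж": "zh", "з": "z",
    "и": "i", "й": "y", "к": "k", "л": "l", "м": "m", "н": "n", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "у": "u", "ф": "f", "х": "kh",
    "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch", "ы": "y", "э": "e",
    "ю": "yu", "я": "ya",
    # Assumption: the hard and soft signs become plain ASCII quote marks.
    "ъ": '"', "ь": "'",
}

# Letters after which е becomes "ye", as listed in the draft.
YE_CONTEXT = set("аеёиоуыэюяйъь")


def romanise_russian(text):
    """Romanise Russian text; е -> ye at word start or after YE_CONTEXT."""
    out = []
    prev = ""  # previous Cyrillic letter, or "" at the start of a word
    for ch in text:
        low = ch.lower()
        if low == "е":
            rom = "ye" if prev == "" or prev in YE_CONTEXT else "e"
        elif low == "ё":
            rom = "yo"  # assumption: always yo, to avoid special characters
        else:
            rom = BASE.get(low, ch)  # non-Cyrillic characters pass through
        if ch != low:
            rom = rom.capitalize()  # keep the capital on the first letter
        out.append(rom)
        prev = low if (low in BASE or low in "её") else ""
    return "".join(out)


print(romanise_russian("Есенин"))
print(romanise_russian("Достоевский"))
```

The first call shows both branches of the rule: the initial Е becomes Ye, while the е after с stays e.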
Well.. we can separate Ukrainian and discuss the details, but due to the extremely low number of beatmaps and people involved I don't see this as productive work.
Kurai wrote:
- Cyrillic Romanisation should follow the BGN/PCGN system (except for the letter ё in Russian which should follow the GOST 2002(B) system). Read more here: http://up.kuraip.net/032209ex3724.pdf
Firis Mistlud wrote:
Wafu wrote:
You, as Chinese have no priority in this matter, just because it's about Chinese, <--- True
Let me reword that:
"You, as Westerners have no priority in this matter, just because it's about Chinese" <--- Also true
NONSENSE
Hollow Wings wrote:
on the contrary, Chinese people have the exact highest priority in this matter, just because it's about Chinese.
This is for Regraz. Please allow me to demonstrate how Hollow Wings has been posting so far.
Hollow Wings wrote:
lol
CrystilonZ wrote:
This is not exactly true. If it were Mandarin would have been dead a long while ago because the only way to communicate would be carrying a crap ton of paper with you at all time and write stuff when you want to communicate.
In English context it would be equivalent to you guys seeing or hearing /tīm/ (IPA stuff. This reads time). Intuitively the first thing that come into your heads would be the time. Tick-tock clocky stuff. However under different contexts:
"Can you buy me some /tīm/. I'm going to use it to cook dinner." In this case /tīm/ is the herb thyme.
"I don't have enough /tīm/ to do my homework. It's due tomorrow." In this case it's "time"
"Two /tīm/ two equal four." In this context it means multiply. 10/10 grammar.
As you can see they are reversible with context. And when you guys speak to each other you're actively tracing back to the original Hanzi characters using their pronunciation. Therefore, saying that it is not reversible is not true. It's harder in Mandarin (410 syllables - crap tons of words. Do the maths) but the fact that there are people speaking Mandarin proves the fact that it's possible.
NONSENSE.
i think you still don't have enough understanding of how complex Chinese words and sentences can become.
again, you CAN'T simply know exactly what those Chinese characters are unless you fully understand all of the components both inside and outside the sentence.
if you just get the sentence without any other context, you will never be able to do that, which means the sentence has several possible meanings.
here are some examples:
a. one of the best examples here, which shows that if you make a mistake with it, you may get into big trouble. this is what the general phenomenon in the Chinese language environment and its romanisation looks like.
"Gu Niang, Shui Jiao Yi Wan Duo Shao Qian?"
this sentence mainly has two meanings:
1. "Hey girl, how much does it cost if i buy a bowl of your dumplings?" (姑娘,水饺一碗多少钱?)
2. "Hey girl, how much does it cost if i sleep with you for one night?" (姑娘,睡觉一晚多少钱?)
this happens widely in electronic alphabetic systems without tones, just like the osu! system.
b. a more common one here.
"Jie Dao Shou Zhang Zai He Shang"
this is too complex, so i'll just transcribe parts of it, and you can do your mathematical mapping and see if you can figure out everything that sentence may mean:
1. "Jie Dao" → 接到(catch/catch up/get/take/etc.), 街道(street/road/way/etc.), etc.
2. "Dao Shou" → 到手(already get sth./reach your hands/etc.), 倒手(transfer things between hands/buy in&out/left hand/etc.), etc.
3. "Shou Zhang" → 手掌(palm/people you trust/etc.), 首长(boss/highest level person/etc.), 收账(charge/blackmail/etc.), etc
...
oh hell, i won't continue.
this also happens even you have words separated:
"Jiedao Shouzhang Zai Heshang"
↑ maybe try your best to figure out what this means, and i can predict that you may find at least 4 meanings.
that's why i'm always saying why it's complex:
alphabetic characters can be transliterated immediately, even if you don't know what that word means.
and this won't work for ideophonographic characters, especially Chinese characters.
and that's also the detailed reason why it's not reversible.
c. some special meme here.
"Shi Shi Shi Shi Shi"
non-Chinese speakers may have no idea what's this.
but it's a popular article called "施氏食狮史", which is one of the best examples of how hard it can be for us to read Chinese with only pinyin (or romanized Latin characters).
if you insist your opinion then try to figure out what this sentence means:
"Ji Ji Ji Ji Ji"
just mention: that's also a wonderful article in Chinese writing.
this is just one form of Chinese meme, there're tons of others in Modern Chinese.
like "爷爷", "不星", etc.
that Latin-Chinese transcription is not reversible is the exact truth.
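The one-to-many mapping described in the examples above can be sketched in a few lines of Python. This is only an illustration built from the word pairs listed in the post; the table `TONELESS` and the helper `build_inverse` are hypothetical names, not part of any osu! tool or Romanisation standard.

```python
from collections import defaultdict

# Word -> toneless pinyin, taken only from the examples in the post above.
TONELESS = {
    "水饺": "shui jiao", "睡觉": "shui jiao",
    "接到": "jie dao", "街道": "jie dao",
    "到手": "dao shou", "倒手": "dao shou",
    "手掌": "shou zhang", "首长": "shou zhang", "收账": "shou zhang",
}


def build_inverse(table):
    """Group words by their toneless pinyin (the reverse direction)."""
    inverse = defaultdict(list)
    for word, pinyin in table.items():
        inverse[pinyin].append(word)
    return dict(inverse)


INVERSE = build_inverse(TONELESS)

# Forward: every word has exactly one romanisation.
print(TONELESS["水饺"])
# Backward: one romanisation leads to several candidate words.
print(INVERSE["shui jiao"])
```

Going from Hanzi to toneless pinyin is a plain lookup, while the reverse lookup returns a list of candidates; choosing among them needs extra information such as tones or surrounding context, which is the point both sides of this exchange are arguing about.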
Maybe people like Hollow Wings are not good at speaking English. Let me simplify this for you.
CrystilonZ wrote:
As you can see they are reversible with context. And when you guys speak to each other you're actively tracing back to the original Hanzi characters using their pronunciation. Therefore, saying that it is not reversible is not true. It's harder in Mandarin (410 syllables - crap tons of words. Do the maths) but the fact that there are people speaking Mandarin proves the fact that it's possible.
Metadata covers tag guideline/rules. How am I in the wrong topic?
Tofu1222 wrote:
Don't you see that you are in the wrong topic.
abraker wrote:
Any thoughts about mapping style or patterns the maps have being in tags?
I don't see any restrictions on this right now as long as they are related to the set. Also, I don't think this is worth mentioning specifically.
abraker wrote:
Any thoughts about mapping style or patterns the maps have being in tags?
To your previous post, if we discuss language, we will use terms related to languages and linguistics. I can't avoid that.
Mishima Yurara wrote:
it makes absolutely no sense that the chinese do not have the higher priority when talking about chinese.. it is Quite Literally the language that they speak AND they are also... Quite Literally... the most Affected by the proposal regarding chinese metadata.. not sure how the priority of a group of people is parallel to a nonreasonable discussion either..
Where's the basis for that? It does make it harder, for the reasons already mentioned in both the proposal and several of these posts. How do Chinese people read it more easily, if the text doesn't change for them at all?
Mishima Yurara wrote:
it really doesnt make it any harder to read the title/artist with separated syllables so i dont see the harm in staying consistent with other platforms alongside making it easier for chinese people to..... read their own language L