If you guys could summarize what rule you want to change/add, that would speed up the process here a lot. Something like
In the case of romanizing ü, use v, not yu. Or something like that. This is just an example btw. I disagree with using "v" as romanization since English speakers will pronounce "v" differently from how it's supposed to be pronounced in Mandarin.
I would like to add a few extra rules to the proposal, taking into account other languages that so far haven't been discussed, since everyone's been caught up on the Chinese romanization debate.
With respect to Korean romanization, I'm wondering if we should continue applying the McCune-Reischauer system. This is the system that the Library of Congress uses. Nyquill brought up an excellent point about using romanization systems that other large institutions already use: in most cases it works a lot better than creating our own modified system (unless we are simplifying).
I'm bringing this up because there is also the Revised Romanization of Hangeul system, introduced on July 7th, 2000, which has been applied to Korean road signs, transportation, etc. The major change, of course, is that the new system eliminates diacritics in favor of digraphs.
A possible rule would look like:
Songs with Korean metadata must be romanised using the McCune-Reischauer system for romanizing Korean when there is no romanisation or translation information listed by a reputable source. The same applies to the Source field if a romanised Source is preferred by the mapper.
Additionally, we could introduce the use of digraphs and two-vowel letters into the proposal:
Vowels ㅓ /ʌ/ and ㅡ /ɯ/ should be written as digraphs in Korean romanization, romanized as eo and eu respectively.
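To make the digraph rule concrete, here is a rough Python sketch of how a checker could pick out the two vowels in question. It relies on the standard Unicode layout of precomposed Hangul syllables; the function name vowel_digraph and the constant names are made up for this example, and it only handles the two vowels named in the rule, not full Korean romanization.

```python
from typing import Optional

HANGUL_FIRST = 0xAC00     # first precomposed Hangul syllable in Unicode
HANGUL_LAST = 0xD7A3      # last precomposed Hangul syllable in Unicode
FINALS_PER_VOWEL = 28     # final-consonant slots per vowel, including "none"
VOWEL_COUNT = 21          # number of medial vowels in the Unicode ordering

# Positions of the two vowels from the proposed rule in Unicode's vowel order.
# (Hypothetical table for illustration; every other vowel is ignored here.)
DIGRAPH_VOWELS = {4: "eo", 18: "eu"}


def vowel_digraph(syllable: str) -> Optional[str]:
    """Return "eo" or "eu" if the single syllable uses one of those vowels."""
    code = ord(syllable)  # expects exactly one character
    if not HANGUL_FIRST <= code <= HANGUL_LAST:
        return None
    vowel_index = (code - HANGUL_FIRST) // FINALS_PER_VOWEL % VOWEL_COUNT
    return DIGRAPH_VOWELS.get(vowel_index)
```

With this sketch, vowel_digraph("서") gives "eo" and vowel_digraph("그") gives "eu", while a syllable with any other vowel gives None.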
Another language to examine is Thai. The Library of Congress recommends nine additional rules for Thai romanization, which are:
Library of Congress wrote:
Romanization
1. Tonal marks are not romanized.
2. The symbol ฯ indicates omission and is shown in romanization by “ … ” the conventional sign for ellipsis.
3. When the repeat symbol ๆ is used, the syllable is repeated in romanization.
4. The symbol ฯลฯ is romanized la.
5. Thai consonants are sometimes purely consonantal and sometimes followed by an inherent vowel romanized o, a, or ǭ depending on the pronunciation as determined from an authoritative dictionary, such as the Royal Institute's latest edition (1999).
6. Silent consonants, with their accompanying vowels, if any, are not romanized.
7. When the pronunciation requires one consonant to serve a double function – at the end of one syllable and the beginning of the next – it is romanized twice according to the respective values.
8. The numerals are: ๐ (0), ๑ (1), ๒ (2), ๓ (3), ๔ (4), ๕ (5), ๖ (6), ๗ (7), ๘ (8), and ๙ (9).
9. In Thai, words are not written separately. In romanization, however, text is divided into words according to the guidelines provided in Word Division below.
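Two of the quoted rules (1 and 8) are purely mechanical, so here is a small Python sketch of them. It drops the four Thai tone marks and swaps Thai numerals for 0-9. The function name preprocess_thai is my own, and this is only a preprocessing step under those two rules; it does not attempt the actual consonant and vowel romanization, which needs a dictionary as rule 5 says.

```python
# Thai digits in order, as listed in rule 8.
THAI_DIGITS = "๐๑๒๓๔๕๖๗๘๙"

# The four Thai tone marks (mai ek, mai tho, mai tri, mai chattawa).
TONE_MARKS = "\u0e48\u0e49\u0e4a\u0e4b"

# Map each Thai digit to its ASCII digit and delete every tone mark.
_THAI_TABLE = str.maketrans(THAI_DIGITS, "0123456789", TONE_MARKS)


def preprocess_thai(text: str) -> str:
    """Apply rules 1 and 8 only: remove tone marks, convert numerals."""
    return text.translate(_THAI_TABLE)
```

For example, preprocess_thai("๒๕๔๒") returns "2542".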
My question for Thai romanization is whether we should treat it the way we are treating Chinese romanization, which is to separate words with spaces, or whether we should clump them together. For example, พระนางเจาพระบรมราชินีนาถ could be romanized as Phranāng Čhao Phrabǭrommarāchinī Nāt or as Phranāngchaophrabǭrommarāchinīnāt. Also, should we capitalize all the separated words, or only the first? Since Thai chains everything together, there is no indicator for upper and lower case when we split the phrase up (if we do).
The two rules I am proposing are:
Songs with Thai metadata must be romanised using the Library of Congress system for romanizing Thai when there is no romanisation or translation information listed by a reputable source. The same applies to the Source field if a romanised Source is preferred by the mapper.
and
In the romanization of Thai, words should be romanized separately, and separated by a space. Additionally, all words should (or should not?) be uppercased.
Attached are helpful transcription keys for Thai:
Another language that is becoming more and more relevant is Arabic, and there are some issues I would like to bring forth with regard to its romanization.
Here is the table for romanization of Arabic:
As you can see, some issues come up. In the romanization of ص ص ص ص, for example (whether initial, medial, final, or alone), the romanization becomes "ṣ". However, the diacritical mark is not something that osu can use, because it is still not Unicode. I would like to propose that all of these diacritical marks attached to letters be removed, for the sake of simplicity and because osu currently does not support them. Therefore something like "ص◌نضوِ◌خ" should be romanized as "sandwich".
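If we go with removing the marks, the stripping itself is easy to automate. Here is a minimal Python sketch using the standard unicodedata module; the function name strip_diacritics is just something I picked. It assumes the romanized text uses letters that decompose into a base letter plus a combining mark, which is true for things like ṣ, ḥ, and ā. Standalone characters such as ʻ and ʼ are not combining marks, so this sketch leaves them alone.

```python
import unicodedata


def strip_diacritics(romanized: str) -> str:
    """Remove combining marks (dots below, macrons, etc.) from romanized text."""
    # Split each accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", romanized)
    # Keep only characters that are not combining marks.
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose whatever is left into its usual form.
    return unicodedata.normalize("NFC", kept)
```

So strip_diacritics("ṣād") comes out as "sad".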
Another problem with Arabic is that it is written in reverse, right to left. Should we also apply this to romanization? In this case "ص◌نضوِ◌خ" would actually be romanized as "hciwdnas" when read left to right, as English readers are expected to do.
The rule I am proposing is:
Songs with Arabic metadata must be romanised using the Library of Congress system for romanizing Arabic when there is no romanisation or translation information listed by a reputable source. The same applies to the Source field if a romanised Source is preferred by the mapper.
Additionally:
In the romanization of Arabic, words should be romanized in reverse order, and the last letter should be uppercased. For example, in romanizing "◌ س◌ !" the correct romanization should be "!usO"
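For anyone who wants to see what this rule would do in practice, here is a quick Python sketch. The function name reverse_romanization is invented for this post. I am assuming the whole string gets reversed at once and that everything else is lowercased before the last letter is uppercased; the rule as worded does not spell out either point.

```python
def reverse_romanization(romanized: str) -> str:
    """Reverse the text, then uppercase the last letter of the result."""
    # Assumption: lowercase everything first so only one letter ends up capital.
    reversed_text = romanized.lower()[::-1]
    # Walk backwards to find the last alphabetic character.
    for i in range(len(reversed_text) - 1, -1, -1):
        if reversed_text[i].isalpha():
            return reversed_text[:i] + reversed_text[i].upper() + reversed_text[i + 1:]
    # No letters at all: nothing to uppercase.
    return reversed_text
```

Under these assumptions, reverse_romanization("Osu!") gives "!usO", matching the example in the rule.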
However, there is also the problem of Judeo-Arabic romanization, which differs slightly from traditional Arabic romanization. Judeo-Arabic, of course, stems from the Jewish Arabs, many of whom live in Iraq and have adopted a slightly different script with respect to certain nouns and verbs. The most common Jewish Arabs are those from Baghdad. Anyways, I digress.
Attached are examples of Judeo-Arabic romanization:
So I would like to propose the following:
Songs with Judeo-Arabic metadata must be romanised using the Library of Congress system for romanizing Judeo-Arabic where Judeo-Arabic nouns and verbs are being used, and where there is no romanisation or translation information listed by a reputable source. Where Judeo-Arabic words and phrases are not used, traditional Arabic romanization will apply. The same applies to the Source field if a romanised Source is preferred by the mapper.
Yet another language I would like to cover is the Cherokee language, also known as Tsalagi Gawonihisdi, which is an Iroquoian language of the native Cherokee people, of whom there are approximately 300,000 tribal members. In terms of syllabary, I again lean toward the ALA-LC Romanization table that was prepared by the Library of Congress, attached here:
Because the language does not use capitalization, I am wondering if a rule should be made to force lower case on all songs, titles, artists, and sources with Cherokee origins. Below I propose the following two rules:
Songs with Cherokee metadata must be romanised using the syllabary provided by the ALA-LC Library of Congress system for romanizing Cherokee when there is no romanisation or translation information listed by a reputable source. The same applies to the Source field if a romanised Source is preferred by the mapper.
As well as:
Songs with Cherokee metadata must use lower case across Title, Romanized Title, Artist, Romanized Artist, and Source.
Lastly, I would like to bring attention to another pictographic language.
In fact, Chinese is not the only pictographic language left in the world. I'm sorry, Hollow Wings, Chinese is special, but it is not that special. You guys have forgotten about the ancient Egyptian Hieroglyphics. Such a shame.
Below is a chart of monoliteral hieroglyphs and their hieratic equivalents, as researched by R. Lepsius in his book Denkmäler aus Aegypten und Aethiopien Abth. II:
As you can see, not all hieroglyphs have been translated yet, and some are still in the process of being discovered because many ancient Egyptian pyramids are currently lost to time. Therefore I have a few rules to propose:
Songs with ancient Egyptian Hieroglyphic metadata must be romanised using the current knowledge of Egyptian Hieroglyphs for romanizing Egyptian Hieroglyphics when there is no romanisation or translation information listed by a reputable source. The same applies to the Source field if a romanised Source is preferred by the mapper.
Additionally, because not all Hieroglyphs have been transcribed yet:
Songs with ancient Egyptian Hieroglyphic metadata that use a hieroglyph that is currently not transcribed must have that hieroglyph replaced with "?" until a proper transcription is decided on. The same applies to the Source field if a romanised Source is preferred by the mapper.
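The "?" fallback could be sketched in Python roughly like this. The function name romanize_signs is made up, the transcription table is supplied by whoever calls it, and I am assuming the signs sit in the Unicode Egyptian Hieroglyphs range U+13000 to U+1342F. Characters outside that range (spaces, punctuation) are passed through untouched.

```python
HIEROGLYPH_FIRST = 0x13000  # assumed start of the Egyptian Hieroglyphs block
HIEROGLYPH_LAST = 0x1342F   # assumed end of the block for this sketch


def romanize_signs(text: str, table: dict) -> str:
    """Replace each hieroglyph via the table, or with "?" if it has no entry."""
    result = []
    for ch in text:
        if HIEROGLYPH_FIRST <= ord(ch) <= HIEROGLYPH_LAST:
            # Known sign: use its transcription. Unknown sign: "?" per the rule.
            result.append(table.get(ch, "?"))
        else:
            # Not a hieroglyph: keep spaces, punctuation, etc. unchanged.
            result.append(ch)
    return "".join(result)
```

With an illustrative one-entry table such as {"\U000130C0": "b"}, any other hieroglyph in the input comes out as "?".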
Lastly,
Songs with ancient Egyptian Hieroglyphic metadata that use a hieroglyph that has not yet been documented should be sent to the American Research Center in Egypt for proper processing. This rule mainly applies to mappers who are currently on archaeological digs in Egypt, find a new pyramid, and want to map the songs that were uncovered.
I hope this will be of use to you guys, and I hope to see some more fruitful and productive discussion come about.