Romanization of Cyrillic

Topic Starter
Wafu
There are actually rules about how some Asian languages should be romanized. Cyrillic, however, has a huge number of romanization systems, each used by a different country, so there are quite a few issues with this.

Because this one doesn't use any special characters, is pretty easy to use, and is similar both to the Modified Hepburn romanization we use for Japanese and to the romanization we use for Chinese, I think it is better to use it to avoid more confusion and inconsistencies:

Cyrillic metadata must be romanized according to this system. This can be ignored if the artist provided an official romanization.

ё in Russian should be romanized as follows:
  1. (ж/ч/ш/щ) + ё = o
  2. (everything else) + ё = yo
If Cyrillic is used to express the pronunciation of a foreign word, use its original spelling:
  1. Гудбай → Gudbay ✗
  2. Гудбай → Goodbye ✓
For simplification, I coded a little program for the Cyrillic languages in use. You can get it there. I am willing to make changes if more exceptions are found or if a song in another Cyrillic language is mapped; then I'd add that language's option.
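The ё rule above could be coded along these lines. This is only a minimal sketch, not the actual program: the name `romanize_yo` is made up for illustration, and it converts nothing but ё, leaving every other character untouched.

```python
# Sketch of the ё rule only; all other characters pass through unchanged.
HUSHING = set("жчшщ")  # consonants after which ё is written as "o"


def romanize_yo(text: str) -> str:
    """Replace ё/Ё with "o" after ж/ч/ш/щ and with "yo" everywhere else."""
    out = []
    prev = ""  # previous source character ("" at the start of the text)
    for ch in text:
        if ch in ("\u0451", "\u0401"):  # Cyrillic ё and Ё
            roman = "o" if prev.lower() in HUSHING else "yo"
            # Keep the capitalisation of the original letter.
            out.append(roman.capitalize() if ch == "\u0401" else roman)
        else:
            out.append(ch)
        prev = ch
    return "".join(out)
```

Calling `romanize_yo("подошёл")` turns the ё into o, while `romanize_yo("дёшево")` gives yo; the remaining letters would still have to go through the main table.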
Kert
yo = ё
h = х
c = ц
That's for Russian
:)
Topic Starter
Wafu

Kert wrote:

yo = ё
h = х
c = ц
That's for Russian
:)
I'd agree, but that's a rather old romanization and is less similar to the Modified Hepburn we use for Japanese.
//Oh, yeah, ё should be o or yo; I wrote the same thing for e and ё. Sorry, I'll fix it soon xD
//Fixed the table, ё is now o or yo as it should be. ц and х were kept as they are to stay similar to the Japanese romanization we have to use and consistent with the romanization style of the rest (c and h are used in more complicated romanizations), because 'c' and 'h' are less well defined in many languages. Some romanizations, for example, romanize ё to ë, which is quite unknown to many people, whereas this is mostly international.
TicClick
Where did you get that table from? ё is nothing like o, й should be j and ь should translate to '. I would suggest using somewhat official GOST 7.79-2000 (with grave accent ("backtick") replaced with apostrophe) instead, which is basically modified ISO 9. It also sounds and looks more natural to natives.
Kurai

TicClick wrote:

Where did you get that table from? ё is nothing like o, й should be j and ь should translate to '. I would suggest using somewhat official GOST 7.79-2000 (with grave accent ("backtick") replaced with apostrophe) instead, which is basically modified ISO 9. It also sounds and looks more natural to natives.
As a foreigner studying Russian, I am more used to the BGN/PCGN (Russian - Ukrainian) romanisation system (and the French system too, but let's not talk about this one haha). I have never seen any text romanised using GOST, and having х romanised to x instead of kh, or щ to shh instead of shch, looks kind of awkward to me. Especially when you have some maps already using the BGN/PCGN one.

Also, many other languages use Cyrillic, so I would personally ask people to refer to this when they want to romanise their song titles from Cyrillic to the Roman alphabet (who knows, maybe we'll have some more amazing Bulgarian music in the future! please no more Azis, I beg you).
TicClick
Whoops, missed kh and shch, although I was gonna mention that. Yeah, it's quite a flaw, х being romanized to x is dumb, and BGN/PCGN looks even more acceptable (ё → yo is probably the only clarification it needs). As for GOST, it's mostly used in official pages when people take a random decision that they should go "by the rules".

But still, й = y?..
Kurai
ё → yo makes sense. It makes me think of the romanisation of "Михаил Горбачёв" to "Mikhail Gorbachev" instead of "Mikhail Gorbachov". Everyone pronounces Горбачёв the wrong way (at least in France) because of this.

As for й → y, I'm quite used to seeing this in texts, as "й" sounds more like "y" than "j" in most (non-Slavic) languages, making it more readable for foreign people who can't read Cyrillic. Though my teachers told me that both "y" and "j" are acceptable transcriptions.
Topic Starter
Wafu
@Kurai: This is BGN/PCGN, except for ë: its sound is undefined because it differs in every language, and it is a Unicode character, so it cannot be used. In Russian, however, it is pronounced o or yo. BGN/PCGN is old, but it is the most reliable system because it reflects the real pronunciation we are used to. ë is often written as o or yo (even with BGN/PCGN, though the system has not been updated), because that is not a 'special sign' and the pronunciation is better defined.

@TicClick й = y: yeah. j is also acceptable, but it is not international, because most countries read J as 'dzh'. It is also the same in Hepburn: yu in Japanese reads the same as йу in Russian, so there really is a lot of similarity.

I would really, as Kurai said, stick to https://www.gov.uk/government/uploads/s ... sation.pdf
It even explains when yë (or rather yo, because it is better defined) is used and when only ë (or rather o, for the same reason) is used.
Krfawy
Question time, can I write the Polish pronunciation in the tags? You know, for example we have 'Катя Миронова' so I'd translate it as "Katya Mironova // Katia Mironova // Katiya Mironava" and so on, but could I put "Katja Mironowa" in the tags?

EDIT: How about 'й' as 'yee'?
Topic Starter
Wafu

Krfawy wrote:

Question time, can I write the Polish pronunciation in the tags? You know, for example we have 'Катя Миронова' so I'd translate it as "Katya Mironova // Katia Mironova // Katiya Mironava" and so on, but could I put "Katja Mironowa" in the tags?

EDIT: How about 'й' as 'yee'?
Tags are used for searching, thus you are obviously allowed to use additional romanizations in tags.

й as yee would maybe be too complicated because it mixes several ways of pronunciation. Compared to Japanese Hepburn, the y there and the y in this romanization are the same, so people are used to it. If 'ee' is taken from English, most people would read this as 'ji' (in Polish pronunciation obviously, I am talking to Krfawy :)), but in reality you read й as just 'j' (again in Polish). That only holds for some languages and is not very international, so it cannot be used. Referring to the table Kurai sent would probably be best, apart from ë.
Krfawy
To be honest as a Pole I can't find any difference between 'j' and 'ji' since they sound the same so that's why I was asking about 'yee'. :D
Topic Starter
Wafu

Krfawy wrote:

To be honest as a Pole I can't find any difference between 'j' and 'ji' since they sound the same so that's why I was asking about 'yee'. :D
Oh... yeah, I see it now, so it could even be just 'y' :D
Topic Starter
Wafu
Updated the description a bit.
Sieg

TicClick wrote:

I would suggest using somewhat official GOST 7.79-2000 (with grave accent ("backtick") replaced with apostrophe) instead, which is basically modified ISO 9. It also sounds and looks more natural to natives.
Topic Starter
Wafu

Sieg wrote:

TicClick wrote:

I would suggest using somewhat official GOST 7.79-2000 (with grave accent ("backtick") replaced with apostrophe) instead, which is basically modified ISO 9. It also sounds and looks more natural to natives.
I'd say BGN/PCGN would cover all Cyrillic languages, thus using GOST would then make it inconsistent, I suppose.

Also let's be honest:
й = j (j is read like this mostly in Slavic languages; most others would read j as дж)
х = x (undefined in most languages; will be confused with 'ecs' from other languages)
ц = cz, c (will be confused with 'ch', for example in Polish, and many international words also read cz as 'ch')
щ = shh (most people will read it the same way as ш, because they get confused when seeing 2x h in a row; they'll probably tend to say it longer or something, but they definitely won't know there is a 'ch' at the end of the sound, so 'shch' from BGN/PCGN defines it better)
The usage of ` and ' is also not very good there.
For comparison, I romanised в полете с крыши дома мы выпьем весь блейзер в мире in both transliteration types.
I'd say BGN/PCGN would be the best choice, because it is easier to read, doesn't need many changes and looks aesthetically better than GOST.
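To make the comparison concrete, here is a small sketch that applies only the four mappings listed above. It assumes ц = ts on the BGN/PCGN side and always uses cz for ц on the GOST side; the names `GOST_DIFF`, `BGN_PCGN_DIFF` and `apply_table` are made up for illustration, and this is not a full transliteration table.

```python
# Only the four letters compared above; every other character is left as-is.
# Assumption: GOST "cz" is used for ц everywhere (the list above also allows "c").
GOST_DIFF = {"й": "j", "х": "x", "ц": "cz", "щ": "shh"}
BGN_PCGN_DIFF = {"й": "y", "х": "kh", "ц": "ts", "щ": "shch"}


def apply_table(text: str, table: dict) -> str:
    """Lowercase the text and replace the letters found in `table`."""
    return "".join(table.get(ch, ch) for ch in text.lower())
```

Running both tables over the same word, for example `apply_table("щй", GOST_DIFF)` and `apply_table("щй", BGN_PCGN_DIFF)`, shows the two spellings side by side.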
cr1mmy
Topic Starter
Wafu

cr1m wrote:

fcuk you
https://en.wikipedia.org/wiki/Yo_(Cyrillic)
https://ru.wikipedia.org/wiki/%D0%81
I already posted this link on the map that started all this; anyway, there is no need to be offensive.
Kobold84
So I think that Ё = yo in every (not sure, but anyway) case except when this letter goes after Ж, Ч, Ш and Щ, then it's o.
For example "подошёл" — "podoshol", "чёрный" — "chorniy" (or "chornyy"?)
Topic Starter
Wafu

Kobold84 wrote:

So I think that Ё = yo in every (not sure, but anyway) case except when this letter goes after Ж, Ч, Ш and Щ, then it's o.
For example "подошёл" — "podoshol", "чёрный" — "chorniy" (or "chornyy"?)
The system itself explains that: "The character should be romanized yë initially, after the vowel characters а, е, ё, и, о, у, ы, э, ю, and я, and after й, ъ, and ь. In all other instances, it should be romanized ë."
But we'd just replace ë with o.

чёрный would be chornyy if you asked.

Also, when an official romanization was recommended, I hope you did not forget that it must be international, which is another reason not to use GOST.
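The quoted rule can be sketched in code like this. It is only an illustration under assumptions: `yo_per_quoted_rule` and its `ascii_only` switch are hypothetical names, the function handles a single word, and it touches nothing but ё.

```python
CYR_YO = "\u0451"       # Cyrillic small letter ё
LAT_E_DIAER = "\u00eb"  # Latin small letter ë, as written in the quoted rule

# Vowels plus й, ъ, ь: after any of these (or word-initially) the rule adds "y".
Y_TRIGGERS = set("аеёиоуыэюяйъь")


def yo_per_quoted_rule(word: str, ascii_only: bool = False) -> str:
    """Romanize only ё in a single word (lowercased), per the quoted rule.

    With ascii_only=True, ë is swapped for plain o, as suggested in this post.
    """
    base = "o" if ascii_only else LAT_E_DIAER
    out = []
    prev = ""  # empty string means "start of word"
    for ch in word.lower():
        if ch == CYR_YO:
            out.append("y" + base if prev == "" or prev in Y_TRIGGERS else base)
        else:
            out.append(ch)
        prev = ch
    return "".join(out)
```

Note that under this rule a ё after a consonant such as д or ч comes out without the y, so with `ascii_only=True` both чёрный and уйдёшь get a bare o.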
TicClick

Wafu wrote:

Kobold84 wrote:

So I think that Ё = yo in every (not sure, but anyway) case except when this letter goes after Ж, Ч, Ш and Щ, then it's o.
For example "подошёл" — "podoshol", "чёрный" — "chorniy" (or "chornyy"?)
The system itself explains that: "The character should be romanized yë initially, after the vowel characters а, е, ё, и, о, у, ы, э, ю, and я, and after й, ъ, and ь. In all other instances, it should be romanized ë."
While I accept the system Kurai proposed, I am afraid the above explanation is still unsatisfactory; for ё after (Ж, Ш, Ч, Щ), o is correct, because these consonants are always either soft or hard, and syllables жё, шё, чё and щё sound like they really do have o. But it's false for the rest of consonants: до doesn't equal дё, бо is different from бё, et cetera, et cetera.

In short:
  1. ё + ж/ч/ш/щ = o
  2. ё + everything else = yo
For loan words, their romanization should obviously be taken from the original language.
Topic Starter
Wafu

TicClick wrote:

While I accept the system Kurai proposed, I am afraid the above explanation is still unsatisfactory; for ё after (Ж, Ш, Ч, Щ), o is correct, because these consonants are always either soft or hard, and syllables жё, шё, чё and щё sound like they really do have o. But it's false for the rest of consonants: до doesn't equal дё, бо is different from бё, et cetera, et cetera.

In short:
  1. ё + ж/ч/ш/щ = o
  2. ё + everything else = yo
For loan words, their romanization should obviously be taken from the original language.
Oh! You did use Kurai's system on Pjecoo's map, but changed the rules a bit. That's okay anyway. However, if we take уйдёшь and consider that you agreed with Kurai's system, then й should be replaced with 'y' in that case. It is definitely not a soft I there, but a Y like in Yellow.

On the other hand, I agree with ё + ж/ч/ш/щ = o, but not completely with ё + everything else = yo. It is true that шё really reads sho; you are right, there is definitely no 'yo' there. Вё, for example, really reads vyo, BUT зё doesn't read zyo: the Z becomes softer, so it sounds like zho. The same goes for дё, which doesn't read dyo but sounds like dho. That imo shouldn't be romanized, because that's just the Russian accent.

The д in девочки, for example, sounds exactly the same as the д in уйдёшь, while there definitely isn't a 'y' after the d; it would be romanized as devochki, not dyevochki, even though it reads exactly the same.

That's why I complained on Pjecoo's map: it doesn't sound like dYosh, but just dosh with the Russian accent, which makes the D soft, not the o. But this could be questionable; I'll update the post anyway after we decide which letters should romanize to yo and o. :)
TicClick

Wafu wrote:

девочки for example sounds exactly the same as d in уйдёшь, while there definitely isn't 'y' after d
Excuse me? These are two different syllables with two absolutely differently sounding vowels. I still think you have trouble getting the pronunciation right, and at this point I'd like to ask where you take it from, considering the flag in your profile.
Aka
would you pronounce дёшево as doshevo?
if you just put o after do, nobody would understand that d is supposed to be soft. it would look completely same as dozhdik (дождик) but these 2 words has completely different vowels, they are pronounced differently. its ё and о. yo and о


yes, you can hear a tiny o at the end of ё but its completely not similar to straight о vowel, its rather soft and more emphasis goes on y sound, which makes the whole syllable sound softer
Topic Starter
Wafu

TicClick wrote:

Excuse me? These are two different syllables with two absolutely differently sounding vowels.
The problem is that I didn't talk about syllables. I know де is not the same as дё; the second letter differs. BUT I talked about the letter, not the syllable. The д reads soft in both cases, and that's not caused by a Y, that's just accent. You have to get used to it no matter where you are from; everyone will figure it out after reading several words. This is something people must deal with.

Let's compare to Japanese. 今日 is きょう in Hiragana, which is Kyou after transliteration. You do not read the U, you read a long O, so in this case people also had to get used to it. These are just rules of the language itself. You might say people will be confused then and will read it wrong, but many languages have these problems. Even English does, for example read and read. They look like the same word, but one of them might be the past tense: both are written the same way, but one reads 'ea' as a long I and the other reads it as just E. That's just getting to know the language.

TicClick wrote:

I still think you have trouble getting the pronunciation right, and at this point I'd like to ask where you take it from, considering the flag in your profile.
I don't want to be offensive, I'll just take this the calm way. What you are providing is not an argument but a fallacy (not an extreme one, but you pull in unrelated things). Firstly, you judge that I have a problem with pronunciation without further explanation; secondly, you diminish my right to discuss this based on the fact that I am neither Russian nor from a country using Cyrillic. I'd appreciate it if we don't change the topic to "You are not Russian, you don't know anything about our pronunciation." but keep providing arguments that are really related; everyone is equal here.

To be more accurate, Scientific transliteration, for example, which was later turned into GOST (and partially into ISO 9), was based mostly on the Czech alphabet and pronunciation.
That means Google Translate could be a good tool to show the difference between the ways of romanizing the word уйдёшь. The only drawback and inaccuracy here is that Google uses odd accents, so sometimes it might be messed up. Play the pronunciation and see the difference.

1st way of romanization was uidesh
2nd way was mine, uidosh (though it should've been uydosh) - If you imagine the Russian accent, isn't it similar to уйдёшь
3rd was yours, uidyosh (or uydyosh if we refer to BGN/PCGN; that's not the problem we are solving now, so nothing to talk about). However, in the original Russian version there is nothing that sounds like yo; just the D is soft, which is just the accent, same as the above-mentioned "read" & "read", which reads differently in 2 exactly identical forms. If you don't agree that it's just accent, then after removing the caron (in this case caron = accent) your romanization would sound like this, so it would be simpler to just swap to IPA to avoid confusion with accent, if you disagree that accent and the language's rules must be considered when reading, as I already mentioned twice with the "read" & "read" thingy.

Aka wrote:

would you pronounce дёшево as doshevo?
if you just put o after do, nobody would understand that d is supposed to be soft. it would look completely same as dozhdik (дождик) but these 2 words has completely different vowels, they are pronounced differently. its ё and о. yo and о


yes, you can hear a tiny o at the end of ё but its completely not similar to straight о vowel, its rather soft and more emphasis goes on y sound, which makes the whole syllable sound softer
Already explained above: each language has a different accent, so words which are written the same way can have different pronunciations. I agree that the difference might be shown better, but D would not be softened by a Y after it, because that conflicts with the chosen romanization rules, where Y reads as I or J (in Slavic languages), and there are no cases where Y softens the letter before it. The accent does that, and it is not something you can affect by spelling.

That would even force us to make complete transliteration of all Russian words, because there are many exceptions where it reads different way than it's written.

For example Южный Поток - Where it is romanized as Potok, but 1st O with accent makes it sound A, but the second O is still normal O.
On the other hand почка is romanized to pochka, where O after P reads like O, not like A in first case, same case.
Same applies to the пот - definitely reads pot, not pat.

Seriously, I just took 3 random words starting with по; now consider how many words starting with по could exist. As I said, it depends; accent changes in most languages even though the non-Unicode transcription doesn't.
TicClick

Wafu wrote:

TicClick wrote:

I still think you have trouble getting the pronunciation right, and at this point I'd like to ask where you take it from, considering the flag in your profile.
I don't want to be offensive, I'll just take this the calm way. What you are providing is not an argument but a fallacy (not an extreme one, but you pull in unrelated things). Firstly, you judge that I have a problem with pronunciation without further explanation; secondly, you diminish my right to discuss this based on the fact that I am neither Russian nor from a country using Cyrillic. I'd appreciate it if we don't change the topic to "You are not Russian, you don't know anything about our pronunciation." but keep providing arguments that are really related; everyone is equal here.
That wasn't a fallacy. I had a suspicion (and I still do) that you either memorized the word or the sound itself wrongly, or aren't very familiar with it, because you keep suggesting the same wrong solution over and over, over and over, despite multiple people trying to reconvince you by giving proper examples. After this, I have a full right to question your words and the sources of your knowledge — at least due to the fact that what you suggest differs from the reality and twists the romanized words in a weird way.

Now on Google Translate.

Wafu wrote:

1st way of romanization was uidesh
Which was mistakenly chosen due to a very widespread and quite common habit of substituting ё with e, but only for spelling purposes, thus sometimes creating homographs.

Wafu wrote:

2nd way was mine, uidosh (though it should've been uydosh) - If you imagine the Russian accent, isn't it similar to уйдёшь
It is not; what you have in input field is not d, but d', which is apparently a soft consonant; I guess this is the root of our misunderstanding. In no way "do" is similar to "d'o"! You can say that 'o ≈ ё, but ё and o, again, are DIFFERENT.

Wafu wrote:

3rd was yours, uidyosh (or uydyosh if we refer to BGN/PCGN; that's not the problem we are solving now, so nothing to talk about). However, in the original Russian version there is nothing that sounds like yo; just the D is soft, which is just the accent, same as the above-mentioned "read" & "read", which reads differently in 2 exactly identical forms. If you don't agree that it's just accent, then after removing the caron (in this case caron = accent) your romanization would sound like this, so it would be simpler to just swap to IPA to avoid confusion with accent, if you disagree that accent and the language's rules must be considered when reading, as I already mentioned twice with the "read" & "read" thingy.
What you have in the input field is not uidyosh. Why is there a hyphen out of the blue? It makes it sound with a slight pause, as if you used a hard sign letter: уйдъёшь. This is wrong and nowhere near Russian.

I have to ask you again to refrain from mixing two languages, as they are different, and your above post, especially the pronunciation part (especially!), just proved that.
Topic Starter
Wafu
Please, read again, as I already stated this more than 3 times.
According to Kurai's system, the Y in ё was only used when a Y could really be heard, and was never used to soften the previous letter.
You have to accept that it is just accent that makes the letter soft. There is nothing like dYO. The D is soft, nothing else.

Same thing as I stated here:

Wafu wrote:

For example Южный Поток - Where it is romanized as Potok, but 1st O with accent makes it sound A, but the second O is still normal O.
On the other hand почка is romanized to pochka, where O after P reads like O, not like A in first case, same case.
Same applies to the пот - definitely reads pot, not pat.
If we want to make our own romanization, then we will have to transliterate every single word separately, or simply deal with the fact that there is an accent which causes inconsistencies (same as read and read, which you didn't even react to; that was the most important part of my post)

TicClick wrote:

I have to ask you again to refrain from mixing two languages, as they are different, and your above post, especially the pronunciation part (especially!), just proved that.
I am not mixing two languages, I've shown you example of scientific romanization for Russian which was based on past Czech language, then was formed to GOST (for Russians) and at the same time ISO 9 was developed which was based on scientific one, but is just more international. So please, don't tell me I am mixing two different languages.

To be honest, if I wanted to be offensive, I could just make tl;dr counter-argument for your "you either memorized the word or the sound itself wrongly, or aren't very familiar with it, because you keep suggesting the same wrong solution over and over, over and over" - I can say you are Russian, thus if you are using romanization, then the one which is aimed to Russian people.

Even "й should be j" and then "But still, й = y?.." has proven you don't know much about international romanization (when you wanted to use Slavic language for international).

TicClick wrote:

despite multiple people trying to reconvince you by giving proper examples
Oh? As far as I know, you were only person trying to re-convince my opinion, but I have still not gotten the "proper example" which I provided (почка or пот thingy - That's not unrelated, it's the same, but you just don't agree with the difference being just an accent and accent shouldn't change the way it is romanized because we would have to solve every single word)
TicClick

Wafu wrote:

According to Kurai's system, the Y in ё was only used when a Y could really be heard, and was never used to soften the previous letter.
You have to accept that it is just accent that makes the letter soft. There is nothing like dYO. The D is soft, nothing else.
So you take ё as YO, with no exceptions? You do realize that you change the word itself by replacing ё with o and make it different, right? If you want the letter itself out of any words so much, at least translate it to 'o, but don't destroy it completely, this is just plain wrong and gives absolutely different results.

Wafu wrote:

Same thing as I stated here:

Wafu wrote:

For example Южный Поток - Where it is romanized as Potok, but 1st O with accent makes it sound A, but the second O is still normal O.
On the other hand почка is romanized to pochka, where O after P reads like O, not like A in first case, same case.
Same applies to the пот - definitely reads pot, not pat.
If we want to make our own romanization, then we will have to transliterate every single word separately, or simply deal with the fact that there is an accent which causes inconsistencies (same as read and read, which you didn't even react to; that was the most important part of my post)
Potok goes with o because it's spelled like that and always has been; we don't judge words solely by how they sound. In order to have our own romanization, we don't have to romanize every single word; that defeats the purpose of a romanization system. Instead, we need a set of rules, at least for the most problematic letters, and some sort of rule is already suggested by, say, Kobold84 in the post above. I don't really see what's so important about destroying a separate letter of the alphabet completely, along with the meaning it bears, other than that it's quite a hazardous action. Going that route, you could just deny hard and soft signs as well and not take any modifiers, that is, accents and other diacritic signs, into account. Why, tell me?

Wafu wrote:

TicClick wrote:

I have to ask you again to refrain from mixing two languages, as they are different, and your above post, especially the pronunciation part (especially!), just proved that.
I am not mixing two languages, I've shown you example of scientific romanization for Russian which was based on past Czech language, then was formed to GOST (for Russians) and at the same time ISO 9 was developed which was based on scientific one, but is just more international. So please, don't tell me I am mixing two different languages.
Unfortunately, you are, or explain why you picked Czech language in Google Translate, along with putting additional signs that may look fitting to you, but in fact alter words and change them up to the point where they can't be considered as romanized correctly.

Wafu wrote:

To be honest, if I wanted to be offensive, I could just make tl;dr counter-argument for your "you either memorized the word or the sound itself wrongly, or aren't very familiar with it, because you keep suggesting the same wrong solution over and over, over and over" - I can say you are Russian, thus if you are using romanization, then the one which is aimed to Russian people.

Even "й should be j" and then "But still, й = y?.." has proven you don't know much about international romanization (when you wanted to use Slavic language for international).
That, however, does not mean that my proposals are invalid; both y and j are acceptable (hell, even i is), it's just the first one that at least seemed more natural to me due to usage of й.

Wafu wrote:

TicClick wrote:

despite multiple people trying to reconvince you by giving proper examples
Oh? As far as I know, you were only person trying to re-convince my opinion, but I have still not gotten the "proper example" which I provided (почка or пот thingy - That's not unrelated, it's the same, but you just don't agree with the difference being just an accent and accent shouldn't change the way it is romanized because we would have to solve every single word)
If that was about "how about we romanize everything based on how it sounds", see the top of my reply. O is o, but ё isn't. Regarding multiple people, you may want to re-read both this thread and the other one where I jumped in, t/323555. For what it's worth, you may want to ask the host (Pjecoo) why they ended up using yo instead of o.

It seems that we're going in circles. I've got so far that you want to remove the so-called accent completely; how about we make a trade-off and use 'o, if you don't want yo so badly, thinking that ё -always- has to be a separate sound? Romanizing words letter by letter can give you both right and wrong results, depending on the word itself.
Topic Starter
Wafu
Unfortunately, I have to stop making arguments, because we really are stuck in a circle - You don't want to agree with my opinion, I don't want to agree with yours. Respectively, ë is not o nor yo, it is just ë, thus we completely have to kick-off the rule from official BGN/PCGN system and stick to the first way where 'y' was used to soften the previous sound, even though it is a bit different in the cases written in the system, but let's not care about it anymore.

I wanted to prove that we don't have to romanize the accent in all cases, IPA is there to show complete accent, this transcription requires common sense then.
However, don't even tell me that 1st 'о' in поток has the same accent like 'о' in почка, because I am 100% sure it is pronounced different way.

About j, y and i: if we were to refer to BGN/PCGN on Pjecoo's map, then it should be y, because that is what the table Kurai provided uses. Why alter the romanization even more? Ruining ë is really enough. j is for Slavic languages, i is for a short and mainly soft i, and with y, depending on the next letter, you will always recognize just the sound we want. If we use something else, we combine several transliteration systems into one. That's destroying too many things at once, so please let's stick to y for the Slavic 'j' and then to the previously stated rules:

ё + ж/ч/ш/щ = o
ё + everything else = yo

Edited the post again. I hope we won't need to make more arguments now that I have at least accepted my opinion being kind of not 100% noticed. All should be fine now, I guess?
TicClick

Wafu wrote:

However, don't even tell me that 1st 'о' in поток has the same accent like 'о' in почка, because I am 100% sure it is pronounced different way.
If by accent you mean stress, yes, these are different; otherwise, if we talk about regional accent, it depends: there are locations where both words would be pronounced exactly the way they're written.

Wafu wrote:

About j, y and i: if we were to refer to BGN/PCGN on Pjecoo's map, then it should be y, because that is what the table Kurai provided uses. Why alter the romanization even more? Ruining ë is really enough. j is for Slavic languages, i is for a short and mainly soft i, and with y, depending on the next letter, you will always recognize just the sound we want. If we use something else, we combine several transliteration systems into one. That's destroying too many things at once, so please let's stick to y for the Slavic 'j' and then to the previously stated rules:

ё + ж/ч/ш/щ = o
ё + everything else = yo

Edited the post again. I hope we won't need to make more arguments now that I have at least accepted my opinion being kind of not 100% noticed. All should be fine now, I guess?
Sounds alright. I can also agree on й being y.

There's also one more reason of why that should be yo: it belongs to the same group as ю, я and somehow е (which, despite being an exception and translating to plain e, does get romanized to ye, when a word starts from it, but that's already in notes).

Made a brief check against the rest of BGN/PCGN, found nothing else that looks/sounds erroneous. I guess that's all? Note #3, however, is unnecessary, but I don't think anyone will ever need it.
Topic Starter
Wafu
I meant stress in this case, sorry if it sounded confusing. I partially meant regional accent for some words, but didn't actually say it well.

Anyway, I think ë should be now fine and I removed the note #3 because it conflicts with the stuff below it.
TicClick
Ouch. I nearly missed ь and ъ; please add them to жчшщ as well. ← IGNORE THIS, my mind flipped for a second

So, the final version is as follows (letters that influence ё go first, then ё itself):

Cyrillic metadata must be romanized according to this system. This can be ignored if the artist provided an official romanization or the word is meant to express a foreign word.

ё in Russian should be romanized as follows:
  1. (ж/ч/ш/щ) + ё = o
  2. (everything else) + ё = yo
If Cyrillic is used to express the pronunciation of a calque word, use its original romanization (for example, Гудбай would not be Gudbay, but Goodbye).
Topic Starter
Wafu
Seems we both already agree on this. My apologies for causing confusion with the "stress" vs. "accent" thing. I don't see any problem anymore: the romanization will be a bit different, but yes, it would require more work to think of a way of marking the previous character as soft. Tbh, 'o would look ugly, so yes, it's not 100% official, yet simple.

I guess someone with knowledge of Cyrillic might check this for more problems. I couldn't find any more, and I think everyone who posted on the last two pages agreed with this.
Kurai
I see that a consensus has been reached. I am not going to add more fuel to the fire as to why yo > ye; it'd be unneeded.

Cyrillic metadata must be romanized according to this system. This can be ignored if the artist provided an official romanization.

ё in Russian should be romanized as follows:
  1. (ж/ч/ш/щ) + ё = o
  2. (everything else) + ё = yo
If Cyrillic is used to express the pronunciation of a foreign word, use its original spelling:
  1. Гудбай → Gudbay ✗
  2. Гудбай → Goodbye ✓

I believe it is less confusing this way.

Wafu wrote:

Seems we both already agree on this. My apologies for causing confusion with the "stress" vs. "accent" thing. I don't see any problem anymore: the romanization will be a bit different, but yes, it would require more work to think of a way of marking the previous character as soft. Tbh, 'o would look ugly, so yes, it's not 100% official, yet simple.
"o" doesn't really look ugly. As long as the romanisation system we use is consistent and helps foreigners read the words properly, it does what it is supposed to do.

If you agree with the amendment I made, I'll change the OP and bubble this thread; otherwise I'll just bubble.
Topic Starter
Wafu

Kurai wrote:

"o" doesn't really look ugly. As long as the romanisation system we use is consistent and helps foreigners read the words properly, it does what it is supposed to do.
o isn't ugly, yet 'o could be. There could be a case with 'o'' in the text. That form was recommended instead of yo for softening the previous letter, but I think yo looks simpler than so many apostrophes.

As for the way you've rewritten the rule, I agree, this form is better. Replaced in the main post.
Kurai

Wafu wrote:

o isn't ugly, yet 'o could be.
Never mind, I just can't read.
Kurai
Got Tic's approval by PM, so I'm just going to go ahead and bubble. If anyone has anything to add, feel free to mention it; it's still open to discussion.
Kert
Why is х still translated to kh instead of only h?
I am totally reading Mikhail Gorbachov as Микхаил Горбачёв. In reality it sounds like the h in the English word 'have'.
Won't foreigners do the same?
Topic Starter
Wafu

Kert wrote:

Why is х still translated to kh instead of only h?
I am totally reading Mikhail Gorbachov as Микхаил Горбачёв. In reality it sounds like the h in the English word 'have'.
Won't foreigners do the same?
Well, I guess we don't change kh because we would need to alter the rest of the romanization. Also, h reads differently in many languages: hole, hentai, haját. (Won't foreigners do the same? I think that's less likely.) Meanwhile, Kazakhstan, for example, is definitely read as Казахстан, and that spelling is used worldwide. I think that's quite reasonable, given that an h after a letter has always changed how the previous letter is pronounced.

Also, as another example, Схоластика = Skholastika.
If it were only h, it would be romanized as Sholastika and people would tend to read it as Шоластика.
After this I don't see a reason to change anything more. I think kh is okay, but maybe someone will come along and completely knock it down, so let's see. :)
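A tiny Python illustration of that collision. The tables are toy ones that contain only the letters needed for this one word, and all the names are made up for the example:

```python
# Toy tables: only the letters needed for "схоластика" / "шоластика".
BASE = {"с": "s", "ш": "sh", "о": "o", "л": "l", "а": "a",
        "т": "t", "и": "i", "к": "k"}
WITH_KH = {**BASE, "х": "kh"}  # х romanized as kh
WITH_H = {**BASE, "х": "h"}    # х romanized as plain h


def romanize(word, table):
    """Map every letter through the table (lowercase only, no context rules)."""
    return "".join(table[ch] for ch in word.lower())


print(romanize("схоластика", WITH_KH))
print(romanize("схоластика", WITH_H))
print(romanize("шоластика", WITH_H))
```

With plain h, both Схоластика and Шоластика come out as "sholastika", so the reader can't tell сх from ш; with kh the two stay distinct.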
Aka
By the way, English quite often sticks in the letter k. My city is called Kharkov in English.
It's more something you just have to get used to, I guess.
Kert
Now I understand where АКХМЭД came from
АКХЪХЪХЪХХхъъ

I didn't realize it can have problems with 's' as a letter before 'h'. I don't see any way to avoid this other than making an exception for the letter х coming after с, з, ж, ч, ш, щ (but I honestly can't remember any words with зх, жх, шх, щх, чх in them, except maybe Апчхи, the sound of sneezing).
Казакхстан кхорошо прокхладно вот локх

Interestingly, I think people don't add k before h when romanizing Japanese? hayate haru nihon
TicClick

Kert wrote:

I KIRR YOU

It's kh because a plain h often gets swallowed and sounds different: eyesight, archive, apathy. That is exactly the reading problem people try to avoid by marking h as a separate sound with k. What's more, the notorious Gorbachev has his own page on the wiki, and his name is written there with kh.
Topic Starter
Wafu

Kert wrote:

Interestingly, I think people don't add k before h when romanizing Japanese? hayate haru nihon
Well, that's for a different reason. Russian is written letter by letter, while Japanese has only syllables plus the single vowels a, i, u, e, o. Each Japanese syllable ends with a/i/u/e/o, so there is no way to find a Japanese word where a non-vowel followed by h would cause an issue. At least I never saw one.
Also, х sits between K and H, and in most languages H is a different sound from it. Actually, I found only 2 languages that use the Latin alphabet and pronounce H similarly to х; most others lacked that contrast.

Maybe the difference between H and KH in IPA could help here:
http://en.wikipedia.org/wiki/Voiced_glottal_fricative (h)
http://en.wikipedia.org/wiki/Voiceless_velar_fricative (kh)
Kurai
Казахстан → Kazakhstan in almost every language.

Honestly, х sounds more like kh than h. I would totally misread that h in every word if х were to be romanised to h.

In most languages, as Wafu pointed out, h comes from deep in the throat while kh comes from the palate, making two different, yet slightly similar, sounds.

Maybe you don't really differentiate those two sounds because the "h" sound that exists in almost every Latin language (and others) doesn't really exist in Russian. It's the same reason why Chinese speakers pronounce "b" and "p" almost the same way, for example: in Chinese, "b" and "p" sound the same.
Topic Starter
Wafu
Just coded some stuff to simplify transliteration for already used languages in osu! - Edited the thread so people might access it easily.
TicClick

Wafu wrote:

Just coded some stuff to simplify transliteration for already used languages in osu! - Edited the thread so people might access it easily.
Misses the first letter at least in Russian:

TicClick wrote:

Ёлка
Елена в ящике

Romanization Tool wrote:

Olka
Elena v yashchike
EDIT: also "ae":

TicClick wrote:

Чаепитие

Romanization Tool wrote:

Chaepitiye
EDIT2: if it's simply replacing two adjacent characters, you need to find another way, so that it works in cases like:

TicClick wrote:

Змееед

Romanization Tool wrote:

Zmeyeed
Topic Starter
Wafu
Updated that, though I'll be changing the algorithm a little so these problems won't occur. Maybe I'll do it over the weekend.

//Edit

Yeah, I know about the ones TicClick posted below; I'll be updating everything over the weekend.
TicClick
Just for the record, none of these currently work:

TicClick wrote:

Еее
еее
ЕЕе
еЕЕ
ЕЕЕ

Romanization Tool wrote:

Yeeye
eyeye
EYee (should be YeYeye)
eEYe
EYeE

EDIT: it appears to be broken for Ae/aE and everything else that mixes upper and lower case around е and ё.
Topic Starter
Wafu
Changed the algorithm just now. I didn't add all the exceptions, but it now reads the replacements from a file. That means if you want an additional language, you can just create a txt file in the folder and put all of that language's characters there, which is much better than recompiling the program all the time. It's a bit brute force, but there are not that many exceptions, so this makes it usable for both community and personal purposes, which is better than everything being locked into the compiled program with no access to it.

I think many errors were fixed that way. If there is an exception, it is enough to put it at the top of the .txt file to avoid the error. If you find one, let me know what goes wrong; it will be much easier to fix errors or make changes, and anyone can do it. :)
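To make the file idea concrete, here is a rough Python sketch of one way an ordered rules file could be consumed. The tab-separated format, the function names and the sample rules are all made up for illustration; this is not the tool's actual code or file layout. Rules nearer the top win, which is how an exception such as чё can override the plain ё rule.

```python
def parse_rules(text):
    """Parse 'source<TAB>target' lines, keeping file order.

    Lines without a tab or with an empty source are skipped.
    """
    rules = []
    for line in text.splitlines():
        src, sep, dst = line.partition("\t")
        if not sep or not src:
            continue
        rules.append((src, dst))
    return rules


def apply_rules(text, rules):
    """Scan left to right; at each position the first matching rule wins."""
    out = []
    i = 0
    while i < len(text):
        for src, dst in rules:
            if text.startswith(src, i):
                out.append(dst)
                i += len(src)
                break
        else:
            out.append(text[i])  # no rule matched: copy the character as-is
            i += 1
    return "".join(out)


# Hypothetical file contents: the exception (чё) sits above the general rules.
RULES_TEXT = "чё\tcho\nё\tyo\nч\tch\nл\tl\nк\tk\nа\ta"
rules = parse_rules(RULES_TEXT)
print(apply_rules("чёлка", rules))
print(apply_rules("ёлка", rules))
```

If the чё line were moved below the ч line, the same input would come out as "chyolka", which is why exceptions have to sit at the top in a scheme like this.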
TicClick
so you just added these 5 to the list instead of fixing the core issue?.. nice
Instead of blindly replacing substrings, I suggest you iterate over the input and romanize it symbol by symbol, considering the previous letter; there aren't going to be exceptions longer than 2 symbols anyway.
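A rough Python sketch of that symbol-by-symbol approach, for illustration only. The table is partial and follows my reading of BGN/PCGN (е becomes ye at the start of a word and after vowels, й, ъ, ь) plus the ё rule from this thread; ъ and ь are deliberately left unmapped and copied through, and the names are made up, not taken from the actual tool.

```python
# Partial Russian table (assumption: BGN/PCGN-style values).
# е and ё are handled in code because they depend on the previous letter.
TABLE = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ж": "zh", "з": "z",
    "и": "i", "й": "y", "к": "k", "л": "l", "м": "m", "н": "n", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "у": "u", "ф": "f", "х": "kh",
    "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch", "ы": "y", "э": "e",
    "ю": "yu", "я": "ya",
}
CYRILLIC = set(TABLE) | set("еёъь")
YE_AFTER = set("аеёиоуыэюя") | set("йъь")  # е -> ye after these
HUSHING = set("жчшщ")                      # ё -> o after these


def romanize(text):
    out = []
    prev = ""  # previous Cyrillic letter, lowercased; "" at the start of a word
    for ch in text:
        low = ch.lower()
        if low == "е":
            latin = "ye" if prev == "" or prev in YE_AFTER else "e"
        elif low == "ё":
            latin = "o" if prev in HUSHING else "yo"
        else:
            latin = TABLE.get(low)
        if latin is None:
            out.append(ch)  # unmapped (space, punctuation, ъ, ь): copy as-is
        else:
            # Case is decided per input letter, so ЕЕе -> YeYeye.
            out.append(latin.capitalize() if ch.isupper() else latin)
        # Any non-Cyrillic character resets the word-start state.
        prev = low if low in CYRILLIC else ""
    return "".join(out)


print(romanize("Змееед"))
print(romanize("Елена в ящике"))
```

With this, Змееед comes out as Zmeyeyed, ЕЕе as YeYeye and Ёлка as Yolka, with no per-word exceptions needed.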
Topic Starter
Wafu

TicClick wrote:

so you just added these 5 to the list instead of fixing the core issue?.. nice
Instead of blindly replacing substrings, I suggest you iterate over the input and romanize it symbol by symbol, considering the previous letter; there aren't going to be exceptions longer than 2 symbols anyway.
Well, the point is: once the file is complete, it will work, it will be editable, and anyone will be able to add languages that aren't provided. For that reason I cannot just hard-code exceptions for Russian. Or maybe I could remake this into a Russian-only version with all the exceptions built in and keep this one for custom languages? I definitely didn't do this to dodge the core issue, but I cannot fix the core issue without building in all the exceptions for each language, which I found to be unreasonably long. But well, it seems I'll make a special version for Russian only.