Perhaps you have to deal with Russian text in your work. Project managers at translation companies, software developers, technical writers, engineers, designers, printers and countless other professionals may need to process or deliver content in Russian — but they may not be able to read it.
How can you tell if what you are looking at is, indeed, coherent Russian text if all you know is that it is not English text? Your best bet is to consult a Russian speaker or someone who can read Russian. However, if such a person is not immediately available, here are some quick-and-dirty ways of catching corrupted Russian text. Note that we are only talking about the typographic appearance of the text: your text may display correctly and still contain an incorrect or nonsensical translation.
The first and most obvious method is to paste the content into a machine translation tool such as Google Translate. If you get a coherent translation back, your text is most likely valid Russian. However, this method will not work with non-live text, for example, a flat image or a printout.
While there are several differences in the use of capitalization between English and Russian, the overall idea behind capitalization is fairly consistent in these two languages. Just like English, Russian capitalizes the first letter of a new sentence and of proper names. So, if your English content is in all caps and the Russian is in lowercase, or if your content is in normal sentence case and the Russian is in reverse letter case, something is not right.
For instance, the heading of a Colta article looks like this in Russian: В России поставят первый памятник Андрею Белому (“Russia to Erect Its First Andrey Bely Monument”). We can see that the first letter of the sentence and the initial letters of three more words, likely proper nouns, are capitalized. This is what the same sentence looks like when its UTF-8 bytes are erroneously read as ISO-8859-5: “а а аОббаИаИ аПаОббаАаВбб аПаЕбаВбаЙ аПаАаМббаНаИаК ааНаДбаЕб ааЕаЛаОаМб.” Capital letters in the middle of a word are a sign that our content likely got corrupted.
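For readers who can run a script on live text, the capitalization check can be roughly automated. The following is a minimal Python sketch, not a definitive detector: the function name has_midword_capital is made up for illustration, and the corruption is simulated on the assumption that UTF-8 bytes were read back as ISO-8859-5.

```python
def has_midword_capital(text: str) -> bool:
    """Return True if any word has an uppercase letter right after a lowercase one."""
    for word in text.split():
        for prev, cur in zip(word, word[1:]):
            if prev.islower() and cur.isupper():
                return True
    return False


heading = "В России поставят первый памятник Андрею Белому"

# Assumption: simulate the corruption by decoding UTF-8 bytes as ISO-8859-5.
garbled = heading.encode("utf-8").decode("iso-8859-5")

print(has_midword_capital(heading))  # False
print(has_midword_capital(garbled))  # True
```

Words written entirely in capitals are not flagged, because the check only looks for a lowercase letter immediately followed by an uppercase one.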
Another way of assessing whether the Russian text in front of you is displayed correctly is to ascertain if any vowels are present in the words. This is a more involved technique, so you may want to run your decision by someone who is familiar with the Cyrillic alphabet.
Russian uses the following vowels (in upper and lower case): Аа, Ее, Ии, Оо, Уу, Яя, Ээ, ы, Ёё, Юю. At least a few of them look like the English letters A, E, O, and Y.
Suppose a Russian word looks like this: сжЦфК═. Apart from the inconsistent capitalization and the presence of a special character, the unbroken consonant cluster, with no vowel anywhere in the word, is another red flag that the word should be checked. This is actually the word Moscow (Москва) displayed with an incorrect character encoding.
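The vowel check can also be sketched in Python for live text. This is only a rough heuristic under stated assumptions: the name words_without_vowels and the min_letters threshold are hypothetical, and the threshold exists because Russian has legitimate vowel-less words, such as the one-letter prepositions в, к, and с, as well as abbreviations.

```python
RUSSIAN_VOWELS = set("аеёиоуыэюя")


def words_without_vowels(text: str, min_letters: int = 3) -> list:
    """Return whitespace-separated tokens that contain letters but no Russian vowel."""
    suspicious = []
    for token in text.split():
        # Ignore digits, punctuation, and symbols such as box-drawing characters.
        letters = [ch for ch in token if ch.isalpha()]
        # Assumption: very short tokens (prepositions like "в") are not suspicious.
        if len(letters) >= min_letters and not any(
            ch.lower() in RUSSIAN_VOWELS for ch in letters
        ):
            suspicious.append(token)
    return suspicious


print(words_without_vowels("В России поставят первый памятник"))  # []
print(words_without_vowels("сжЦфК═"))  # ['сжЦфК═']
```

A flagged word is a candidate for a closer look, not proof of corruption, since abbreviations and acronyms can legitimately lack vowels.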
Finally, Latin characters in the middle of a Russian word also warrant a thorough check. The challenge here is that some Cyrillic letters look the same as their Latin counterparts. You may want to refer to a Russian alphabet to see if the letter is really an outlier.
Take the same word Москва (Moscow). If we see something like јЮбЪТР, we notice that the letters j, T, and P could potentially be Latin letters. A quick comparison with the Russian alphabet reveals that T and P are legitimate Russian letters, while j is not.
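Because look-alike letters are hard to tell apart by eye, comparing each character against the Russian alphabet is a task a short script handles well. The sketch below is illustrative only: non_russian_letters is a hypothetical helper, and the garbled sample is produced on the assumption that ISO-8859-5 bytes were read back as Windows-1251.

```python
# The 33 letters of the Russian alphabet in both cases:
# U+0410..U+044F (А..я) plus Ё (U+0401) and ё (U+0451).
RUSSIAN_LETTERS = {chr(cp) for cp in range(0x0410, 0x0450)} | {"\u0401", "\u0451"}


def non_russian_letters(word: str) -> list:
    """Return the letters in `word` that are not part of the Russian alphabet."""
    return [ch for ch in word if ch.isalpha() and ch not in RUSSIAN_LETTERS]


# Assumption: simulate the corruption by decoding ISO-8859-5 bytes as Windows-1251.
garbled = "Москва".encode("iso-8859-5").decode("cp1251")

print(non_russian_letters("Москва"))  # []
print(non_russian_letters(garbled))   # ['ј']

# A Latin "M" that looks identical to the Cyrillic "М" is caught as well.
print(non_russian_letters("\u004dосква"))  # ['M']
```

Comparing code points rather than shapes means the legitimate Cyrillic Т and Р pass, while the outlier and any Latin look-alikes are reported.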
In any case, if you do not read Russian, you should have a native or fluent speaker proofread your text, provide solutions for any issues, and confirm that it is good to go. However, the techniques above let you do a quick assessment and catch incorrectly displaying Russian text before it goes to publication.