This article is part of a larger guidebook by RuNet Echo to help people learn how to conduct open-source research on the Russian Internet. Explore the complete guidebook at the special project page.
Conducting open-source research can be difficult, and it’s even more challenging when you cannot read or write the language of your research topic. Thanks to the Internet, however, people who don’t speak Russian can still conduct such research in Russian. Doing so takes equal parts patience and technical know-how in navigating search algorithms and translation tools. This guide identifies and explains free resources that allow you to conduct Russian-language open-source research without knowing the language. Here you’ll also learn how to navigate search algorithms on Google and Twitter to retrieve usable information.
If you have little or no Russian language skills, then translation software is an absolute necessity for conducting research on Russian open-source information. Google Translate is the most versatile tool for translation, offering many features you likely know about already, along with some others that might be less familiar.
The basic translation function of Google Translate is probably already familiar: paste in Russian text (or type English text) and receive a translation. You can also enter a website’s URL into the translation box and see the entire page translated into the language of your choice.
If you have trouble reading the Russian alphabet, Google Translate’s “drawing” function, which allows you to outline letters using your mouse or trackpad, converts the sketches into editable, translatable text. See below for an example with a Магнит (“Magnit,” meaning “magnet”) supermarket storefront:
As you draw your text, Google automatically suggests editable text to match your sketch. The first suggestion isn’t always correct, so be sure to review the options.
In an especially useful feature, Google’s tool can also recognize cursive handwriting, which can greatly differ from printed Russian text. Below, the phrase до свидания (“do svidaniya,” or “goodbye”) is successfully rendered into type by Google’s handwriting tool:
As with English, there are a seemingly endless number of bureaucratic acronyms in Russian, and the website Sokr.ru helps navigate many of these puzzles. For decoding location-specific language, especially useful is this Russian Wikipedia entry, which lists the common abbreviations for various localities in Russia (cities, towns, villages, and so on).
For example, the following text is found in the title of a video on VK, put out by a pro-separatist video channel. Using the two tools outlined above, it’s possible to decipher the information, even without speaking Russian.
г.Донецк Ленинский р-н обстрел РСЗО “Ураган”
There are three abbreviations here, all of which are very common among videos related to the Ukrainian conflict:
- г. Донецк: The first listing for “г.” on Sokr.ru is “город” (city), which is also the second entry on the Wikipedia page listing abbreviations. This describes the city of Donetsk (Донецк).
- Ленинский р-н: Both Sokr.ru and Wikipedia expand “р-н” as “район” (region/district). This describes the Lenin (Ленинский) district.
- РСЗО: This has nothing to do with localities, so we must rely on Sokr.ru, which tells us that this acronym is short for “реактивная система залпового огня” or “ракетная система залпового огня” (reactive system of volley fire / rocket system of volley fire). The English equivalent is “multiple launch rocket system” (MLRS), which in this case refers to the Uragan (“Hurricane”) MLRS.
Therefore, the title of the video is “City of Donetsk Leninskiy district shelling MLRS ‘Uragan.’”
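The lookup steps above can be sketched in a few lines of Python. This is only an illustration: the `ABBREVIATIONS` dictionary and the `annotate` function are hypothetical names, and the glossary contains nothing beyond the three abbreviations decoded in this example. In practice, the expansions would still come from Sokr.ru or the Wikipedia list.

```python
# Hypothetical mini-glossary holding only the abbreviations decoded above.
# Real expansions should be looked up on Sokr.ru or the Wikipedia list.
ABBREVIATIONS = {
    "г.": "город (city)",
    "р-н": "район (district)",
    "РСЗО": "реактивная система залпового огня (MLRS)",
}


def annotate(title):
    """Return (abbreviation, expansion) pairs for known abbreviations in title."""
    found = []
    for abbr, expansion in ABBREVIATIONS.items():
        # Plain substring matching: simple, but it can produce false positives
        # when an abbreviation happens to appear inside a longer word.
        if abbr in title:
            found.append((abbr, expansion))
    return found


title = "г.Донецк Ленинский р-н обстрел РСЗО “Ураган”"
for abbr, expansion in annotate(title):
    print(f"{abbr} -> {expansion}")
```

Keeping a personal glossary like this can save repeated lookups, since the same handful of abbreviations tends to recur across video titles on a given topic.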
Searching in Russian
In English, words rarely change based on their role in a sentence, but they occasionally will add a letter to indicate count or possession (soldier, soldiers, soldier’s, soldiers’), or in the case of some pronouns to indicate subject or object (who or whom, he or him, she or her, and so on).
Things are not so simple in Russian, as there are about a dozen forms for each noun (indicating singularity or plurality and case), and several different conjugations of verbs. Depending on its role in the sentence (subject, direct object, indirect object, plural, plural direct object, and so on), the word “cat” can take the following nine distinct declined (changed) forms: кошка, кошки, кошке, кошку, кошкой, кошек, кошкам, кошками, кошках. Making things more complicated, each of these spellings returns different results in a Twitter search. For example, the simple “cat” or “cats” forms of the noun will bring back far more results than searches for “with a cat” or “with cats.”
Wiktionary provides the declined forms for most words in its Declension section, along with verb conjugations. For example, the Wiktionary entry for кошка (cat) shows the singular and plural forms for each of the six Russian cases. When carrying out searches, be sure to remove the accent mark (the small mark on the о́, as seen in ко́шка) from the letters by placing the cursor immediately after the accented letter and pressing backspace once.
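If you copy many forms at once, removing each accent by hand gets tedious. The following is a minimal Python sketch that does it in bulk, assuming the stress mark is encoded as the Unicode combining acute accent (U+0301) placed after the vowel. The function name `strip_stress` is hypothetical.

```python
# Assumption: the stress mark is the combining acute accent, U+0301,
# stored as a separate character right after the stressed vowel.
COMBINING_ACUTE = "\u0301"


def strip_stress(word):
    """Remove stress marks while leaving real letters such as й and ё intact."""
    return word.replace(COMBINING_ACUTE, "")


accented = "ко\u0301шка"  # ко́шка as copied from a dictionary entry
plain = strip_stress(accented)
print(plain)
print(len(accented), len(plain))  # the accented string is one character longer
```

The sketch deletes only that one character rather than stripping all diacritics, because й and ё are distinct Russian letters and a broader cleanup would turn them into и and е.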
When searching on Google, this is not much of a problem, as the search algorithm is able to include many forms of a word in its results. When you enter the Russian word for cat (кошка) in a search, relevant results that include the declined forms (cats, to a cat, of a cat, on a cat, etc.) will also come up. Therefore, unless you put the word in quotation marks to specify the results, there is no need to worry about noun and verb forms when searching on Google in Russian.
On Twitter, however, it’s a different story. Twitter can only return results for the exact letters given, and is not able to detect changes in verbs (such as “run,” “ran,” “runs”) or nouns.
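One possible workaround is to search for several forms at once. The Python sketch below joins the nine forms of “cat” listed earlier into a single query, assuming that the Twitter search box accepts the `OR` operator between terms and that search pages follow the `twitter.com/search?q=` URL pattern. Both are assumptions to verify before relying on them, and the function names are hypothetical.

```python
from urllib.parse import quote

# The nine declined forms of "cat" listed earlier in this guide.
FORMS = ["кошка", "кошки", "кошке", "кошку", "кошкой",
         "кошек", "кошкам", "кошками", "кошках"]


def or_query(forms):
    """Join word forms with OR, dropping duplicates but keeping their order."""
    unique = list(dict.fromkeys(forms))
    return " OR ".join(unique)


def search_url(query):
    """Build a search URL. The URL pattern used here is an assumption."""
    return "https://twitter.com/search?q=" + quote(query)


query = or_query(FORMS)
print(query)
print(search_url(query))
```

A combined query casts a wider net than any single form, at the cost of more irrelevant results, so it works best as a first pass before narrowing to the forms that actually appear in relevant tweets.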
If you are trying to find information about the evacuation of a building in the city of Kazan, you can get drastically different search results depending on the word forms you enter in your search. For example, try two searches that pair forms of the verb “эвакуировать” (to evacuate) with Казань (normal form of “Kazan”) and в Казани (“in Kazan”).
1. Эвакуировать Казань (“evacuate Kazan,” with no changes to verb/noun forms)
On Google News, the results include various forms of both “Kazan” and “evacuate,” as well as the noun “evacuation” (эвакуация) and the adjectival form of Kazan in various declensions:
On Twitter, you receive far fewer results, as this form of the “evacuate” verb is rarely used in tweets and news articles. As you see in the results below, there is a 4-month gap between three tweets, indicating that there is either hardly any information about this topic, or the search phrase needs to be revised.
2. Эвакуировали в Казани (“evacuated in Kazan”)
Using the past tense plural form of “evacuate” (common in passive constructions) yields far more useful results than the first search terms. (If you look back to the results from Google News, the second result uses these forms of the words “evacuate” and “Kazan.”) The tweets below are quite varied, but all are relevant to the search, covering the evacuation of a church after a fire, a bus evacuation, and a school evacuation:
If you are not familiar with Russian, it is difficult to know what phrases and search parameters to use to maximize results. So, it’s a good idea to enter general Russian terms and phrases into Google, such as “evacuation” and “Kazan,” and then copy the forms that appear most often in the results. After we did this with the news results from Google, we found many more relevant results in our Twitter search, which, it’s important to recall, does not accommodate the malleable nature of Russian nouns and verbs.
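Spotting the most common forms can be done by eye, but a rough tally is easy to script. The Python sketch below counts every word that begins with a given stem in a block of pasted text. The function name `count_forms` and the stems are hypothetical, and the sample string is a placeholder assembled from phrases used in this guide rather than real search results.

```python
import re
from collections import Counter


def count_forms(text, stem):
    """Count each word in text that starts with stem, ignoring case."""
    words = re.findall(r"\w+", text.lower())
    return Counter(w for w in words if w.startswith(stem.lower()))


# Placeholder text built from phrases in this guide, NOT real headlines.
# In practice, paste in the headlines copied from your Google results.
sample = (
    "Эвакуировали в Казани. Эвакуировать Казань. "
    "Эвакуация в Казани. Эвакуировали в Казани."
)

for stem in ("эваку", "казан"):
    print(stem, count_forms(sample, stem).most_common())
```

Matching on a shared stem is a crude stand-in for real grammatical analysis, so it can miss forms whose spelling changes near the start of the word, but it requires no knowledge of Russian.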
As you can see from this guide, researching in Russian without knowing Russian can be frustrating and time-consuming, but it is ultimately possible. You do not have to know how to read Russian, or even know the Cyrillic alphabet, to conduct Russian-language open-source research, but it certainly helps to have a basic familiarity with the language. After a while, you will start to recognize the letters and figure out how to read the alphabet, especially as certain names and cities begin to reappear in your searches. In the meantime, this guide will help you decode opaque acronyms and use Twitter searches and Google to assist your research efforts.
By Aric Toler, Global Voices