MULTEXT-East morphosyntactic specifications for Russian


The purpose of this document is to provide standardised word-level morphosyntactic specifications for Russian, i.e. a tagset and its definition in terms of attribute-value pairs.

These specifications follow the (draft) Version 4 of the multilingual MULTEXT-East specifications, which can be found on

The basic idea is that for each major category (Noun, Verb, Adjective, etc.) the specifications define a fixed set of attributes (Case, Number, Gender, Animacy, etc.), each with its own set of values (e.g. masculine, feminine, neuter). Each category-dependent attribute is assigned a position, and each of its values a one-letter code, so that a complete morphosyntactic description of a word can be encoded as a compact string, a MorphoSyntactic Description (MSD). For instance, the attribute-value specification Category = Noun, Type = common, Gender = masculine, Number = singular, Case = accusative, Animate = no corresponds to the MSD Ncmsan. If an attribute is not applicable to a given combination of features or to a particular lexical item, its position is filled with a hyphen, e.g. Afpns-s, an Adjective that is qualificative, positive, neuter and singular, where Case is undefined because the adjective is in the short form.
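
The positional encoding can be illustrated with a minimal Python sketch that expands an MSD back into attribute-value pairs. The tables cover only the two examples given above and are not the full Russian tagset. The function name decode_msd, the table layout, and the Adjective attribute names Degree and Definiteness are assumptions made for illustration and are not taken from this document.

```python
# Illustrative tables only: they cover the two MSDs used as examples in the
# text (Ncmsan and Afpns-s), not the complete Russian specification.
CATEGORIES = {"N": "Noun", "A": "Adjective"}

# For each category, attributes are listed in positional order; each attribute
# maps its one-letter codes to values.
ATTRIBUTES = {
    "N": [
        ("Type", {"c": "common"}),
        ("Gender", {"m": "masculine"}),
        ("Number", {"s": "singular"}),
        ("Case", {"a": "accusative"}),
        ("Animate", {"n": "no"}),
    ],
    "A": [
        ("Type", {"f": "qualificative"}),
        ("Degree", {"p": "positive"}),        # attribute name assumed
        ("Gender", {"n": "neuter"}),
        ("Number", {"s": "singular"}),
        ("Case", {}),                          # no case codes needed here
        ("Definiteness", {"s": "short"}),     # attribute name assumed
    ],
}


def decode_msd(msd):
    """Expand an MSD string into a dict of attribute-value pairs.

    Positions holding a hyphen are skipped, because the hyphen marks an
    attribute that does not apply.
    """
    if not msd:
        raise ValueError("empty MSD")
    category, codes = msd[0], msd[1:]
    if category not in CATEGORIES:
        raise ValueError(f"unknown category code: {category!r}")
    slots = ATTRIBUTES[category]
    if len(codes) > len(slots):
        raise ValueError(f"too many positions in MSD: {msd!r}")

    result = {"Category": CATEGORIES[category]}
    for (name, values), code in zip(slots, codes):
        if code == "-":
            continue  # attribute not applicable
        if code not in values:
            raise ValueError(f"unknown code {code!r} for attribute {name}")
        result[name] = values[code]
    return result
```

With these tables, decode_msd("Ncmsan") yields Category = Noun, Type = common, Gender = masculine, Number = singular, Case = accusative, Animate = no, while decode_msd("Afpns-s") returns no Case entry at all, since that position holds a hyphen.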

Serge Sharoff, Mikhail Kopotev, Tomaž Erjavec, Anna Feldman and Dagmar Divjak. Date: 2008-05-20
