OLiA annotation model for the morphosyntactic specifications of MULTEXT-East v. 4. (Erjavec 2010). Unless marked otherwise, all comments refer to this document.
Additionally, Qasemizadeh & Rahimi (2006), Dimitrova et al. (2009) and Derzhanski & Kotsyba (2009) were consulted for clarification. Email communication with Tomaž Erjavec, Serge Sharoff, Dan Tufis, Ivan A. Derzhanski, Natalia Kotsyba, Csaba Oravecz and Hamidreza Kobdani represents the third source of information consulted for this ontology.
References:
Ivan Derzhanski, Natalia Kotsyba (2009), Towards a Consistent Morphological Tagset for Slavic Languages: Extending MULTEXT-East for Polish, Ukrainian and Belarusian, In: Proc. MONDILEX Third Open Workshop, Bratislava, Slovakia, 15–16 April, 2009, p. 9-26
Ludmila Dimitrova, Radovan Garabík, Daniela Majchráková (2009), Comparing Bulgarian and Slovak Multext-East morphology tagset, In: Proceedings of MONDILEX Second Open Workshop, Kyiv, Ukraine, 2–4 February, 2009, p. 38-46
Tomaž Erjavec (ed., 2010), MULTEXT-East Morphosyntactic Specifications Version 4. 2010-05-12, http://nl.ijs.si/ME/V4/msd/html/index.html
Behrang Qasemizadeh and Saeed Rahimi (2006), Persian in MULTEXT-East Framework, in T. Salakoski et al. (eds.): FinTAL 2006, LNAI 4139, pp. 541 – 551, 2006.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Abbreviation
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#AbessiveCase
Case="abessive" (Estonian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#AblativeCase
Case="ablative" (Estonian, Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#AccusativeCase
e.g., blishgnaha/blïžnji, Bohouv/böguw, Bohove/böguw, božjega/böžji, cach/kak, Christusa/krïštuš, coga/du, Coj/koj, colina/kolënu (sl-rozaj)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ActiveVoice
Voice="active"
Macedonian has two types of (adjectival) participles: active and passive. Active corresponds to the Macedonian L-form and passive to the verbal adjective, neuter gender, singular. For example, nosel is encoded as VForm=Participle, Voice=Active, nosen as VForm=Participle, Voice=Passive.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#AdessiveCase
Case="adessive" (Estonian, Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#AditiveCase
Case="aditive" (Estonian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Adjectival
Pronoun/Syntactic_Type="adjectival" (Slavic), Abbreviation/Syntactic_Type="adjectival"
Pronouns can be distinguished by whether they have a (syntactically) nominal or a (syntactically) adjectival function. All pronominal types except the demonstrative and possessive ones can be nominal, and all except the personal one can be adjectival.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#AdjectivalAdverb
Type="adjectival" (Serbian, Macedonian, Bulgarian)
Bulgarian AdjectivalAdverbs have the same form as adjectives in Gender = neuter, Person = 3, Number = singular.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Adjective
Ukrainian adjectival participles are grouped with adjectives and are characterized by voice, quasi-tense and aspect.
Slovak adjectival nouns (gazdiná, hostinský) are classified as nouns. Sometimes the distinction between a noun and an adjective is not as clear as we want (obchodný cestujúci).
Slovak negative adjectives have negative lemma and negativeness is not marked otherwise.
Macedonian and Slovak negative adjectives are the product of derivation, thus they belong to other types.
Macedonian adverbs like mnogu, malku, nekolku are also considered adjectives when they are used before nouns. The adverbs that sometimes come before nouns (mnogu, malku, nekolku) can have definiteness in their inflectional paradigm. That is why these adverbs are considered adjectives as well.
In the modern Resian 'swöj' / own is an adjective, not a pronoun.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#AdjectiveFormation
Adjective with feature "Formation"
The Formation attribute distinguishes a nominal (short) form from a so-called compound (long) form of an Adjective in Czech. The nominal form can be used in the predicative function only. It is specified for nominative and accusative Case only.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Adposition
Persian: Farsi has several prepositions but there is only one postposition, 'را' (râ). It is an overt marker for the direct object. (Qasemizadeh and Rahimi 2006)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#AdpositionFormation
Adposition/Formation
Czech: A preposition can be contracted with a pronoun; such a preposition has Formation=c(ompound).
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Adverb
Polish: Post-prepositional adjectives like (po) polsku are treated as adverbs.
Macedonian: Adverbs like mnogu, malku, nekolku are also considered adjectives when they are used before nouns. The adverbs that sometimes come before nouns (mnogu, malku, nekolku) can have definiteness in their inflectional paradigm. That is why these adverbs are considered adjectives as well.
Slovak: Particles form a separate part of speech category (see below) as is customary in Slovak grammars.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Adverbial
Pronoun/Syntactic_Type="adverbial" (Polish, Serbian, Russian, Ukrainian), Abbreviation/Syntactic_Type="adverbial"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#AffirmativeParticle
Particle/Type="affirmative"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#AgglutinantClitic
Clitic="agglutinant" (Verb, Pronoun: Polish)
Polish: The agglutination phenomenon in Polish is similar to Czech clitic_s for pronouns, but has a wider scope and can be found in more parts of speech. It is encoded as a more general "Clitic(y/n/a/d)" attribute and is specified, e.g., for the indicative VForm with Tense=pa(s)t, corresponding to the "praet" flexeme in the IPIC, to differentiate between forms like gniótł (clitic="n") and gniotł- (clitic="d"), where the latter not only demands a clitic but also has a different form. The value "(a)gglutinant" indicates the clitic itself, e.g., -em in gniotłem. Values "y" and "n" are left to enable showing that a graphical word, i.e., delimited by white spaces, is a combination of a (d)emanding (or free) segment and an (a)gglutinant in case the word segmentation should be revised in the future. Prepositionality is encoded as Clitic with values "y(es)" for nią, niego etc., "n(o)" for ją, go etc., "a(gglutinant)" for -ń. Cf. the Clitic value "bound" for Slovene pronouns like zate which refers to the whole cluster, formally a combination of a preposition and a pronoun. This coding can be used for similar phenomena in Polish, e.g. dlań (for him), given the word segmentation is revised towards a more traditional one.
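The segmentation described above can be sketched in Python. This is only an illustration under stated assumptions: the function name split_agglutinant and the table CLITIC_VALUES are hypothetical and not part of MULTEXT-East or OLiA; the value codes and the example gniotłem are taken from the comment above.

```python
# Clitic attribute values for Polish, as listed in the comment above.
CLITIC_VALUES = {"y": "yes", "n": "no", "a": "agglutinant", "d": "demanding"}


def split_agglutinant(word, agglutinant):
    """Split a graphical word into a demanding segment and an agglutinant.

    Hypothetical helper: the agglutinant must be supplied by the caller,
    no morphological analysis is attempted here.
    """
    if not agglutinant or word == agglutinant or not word.endswith(agglutinant):
        raise ValueError(f"{word!r} does not end in agglutinant {agglutinant!r}")
    stem = word[: -len(agglutinant)]
    # The stem demands a clitic (Clitic="d"); the ending is the clitic itself (Clitic="a").
    return [(stem, "d"), (agglutinant, "a")]


print(split_agglutinant("gniotłem", "em"))
```

The free form gniótł would instead carry Clitic="n" and is not split.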
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#AllativeCase
Case="allative" (Estonian, Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#AmbiguousCliticness
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#AmbiguousDefinitenessFeature
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#AmbivalentAspect
Aspect="ambivalent" (Verb: Slovak)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Animacy
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Animate
Animate="yes" (Slavic Noun/Pronoun http://nl.ijs.si/ME/V4/msd/html/msd.N.html; Czech verb)
Ukrainian: The feature "Animate" is used to differentiate between two accusative masculine forms.
Resian: Animacy can also be marked on neuter singular accusative Nouns. The feminine declension masculine noun has only one Ncmsa, that is marked as animate: oćo / father is Ncmsa--y.
Slovak distinguishes masculine animate (Animate=yes above) and masculine inanimate (Animate=no) Gender. Masculine inanimate nouns always have the same form in the nominative and accusative case, whereas masculine animate nouns have predominantly the same form in the genitive and accusative case. Masculine animate nouns and masculine inanimate nouns differ in accusative singular, nominative (vocative) and accusative plural only.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#AoristTense
Tense="aorist"
In Bulgarian, there is a language specific Tense=aorist(a) value for the Tense attribute.
Past perfect tense “aorist” expresses a past action (event) carried out or completed in a given moment or during a given period and finished before the state of speaking.
(Dimitrova et al. 2009)
Resian: The aorist is encountered sporadically in historical texts only. (MTE v4)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ApproximateNumeral
Bulgarian has an additional ... Form=approx(a), used for approximate numerals (десетина /about a ten/, стотина /about a hundred/) (Dimitrova et al. 2009)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Article
Determiner/Type="article" (Persian), Article (Romanian, Resian, Hungarian)
Note that for English, definite and indefinite articles are represented as values of "Determiner/Type".
Note that Determiner/Type="article" means that the token is in the intersection between Article and Determiner. Article and determiner are independent top-level concepts that may, however, overlap for some languages:
"For Romanian, the distribution of articles is fixed, while for determiners is not. Also, the determiners are Person marked while the articles are not." (Dan Tufis, email 2010/06/09)
Persian: There are different types of determiners, namely demonstrative, indefinite, interrogative, exclamative, and article. As defined here, there is just one article in Farsi, i.e., 'یک' (yek). It is a homonym of 'یک', which is a number. (Qasemizadeh and Rahimi 2006)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Aspect
Macedonian: In cases where a lemma (and its wordforms) can be both progressive and perfective, Aspects is given the value ?-?, in order to avoid excessive ambiguity in the lexicon.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#AspectParticle
Particle/Type="aspect" (Romanian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#AttributivePronoun
Pronoun/Referent_Type="attributive" (Bulgarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#AuxiliaryVerb
Type="auxiliary"
Czech: Auxiliary verbs (Type=a) include neither the verb "být" (see above), nor the modal verbs.
Slovak: Auxiliary verbs (Type=a) include neither the verb "byť" (see above), nor the modal verbs, and are limited to "mať" (MTE v4)
Resian: Verbs that can be of more than one Type are tagged according to their actual function, e.g. byt / to be can be main, copula or auxiliary. (MTE v4)
Macedonian: We distinguish three types of verbs: main, auxiliary and modal. The word bi is considered as particle, rather than verb copula. (MTE v4)
Persian: Future tense is made with the help of auxiliary verbs. In order to make the progressive form in Farsi, verbs are inflected with the prefix 'می' (mī). Perfective forms of verbs are usually made using auxiliary verbs 'ام، است، …' (am, ast, …). Passive forms of verbs in Farsi are made with the help of auxiliary verbs: the passive form of the verb is made of Past Participle + auxiliary verb 'شدن' (šodan). (Qasemizadeh and Rahimi 2006)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#BaseVerb
Type="base" (English)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Biaspectual
Aspect="biaspectual" (Verb: Slovene, Russian, Ukrainian; Adjective: Ukrainian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#BothNumeral
Form="both" (Romanian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#BoundClitic
Clitic="bound" (Slovene and Resian pronoun)
Clitic="bound" appears in Slovene and indicates in fact the whole cluster, e.g. "zame, pome", a combination of a preposition and a pronoun. So, ontologically, "bound" is rather ElementWithClitic for Slovene.
(Natalia Kotsyba, email 2010/06/21)
In Resian, however, "bound" seems to be a CliticElement. At least, the Resian MSD index lists for nas/mï both Clitic=bound (Pp1-pa--b-n) and Clitic=no (Pp1-pa--n-n). This is really a problem, because the only proper generalization over both uses would be to specify it as being ambiguous between CliticElement (for Resian) and ElementWithClitic (for Slovene).
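The two Resian MSDs quoted above differ in exactly one position. A minimal Python sketch of such a comparison, assuming only that MSDs are positional strings in which '-' stands for a non-applicable attribute (and, as a further assumption, that omitted trailing positions can be padded with '-'); the helper name msd_diff is hypothetical and no attribute names are inferred:

```python
def msd_diff(a, b):
    """Return (1-based position, char in a, char in b) for each position where two MSDs differ."""
    width = max(len(a), len(b))
    # Assumption: missing trailing positions are equivalent to '-'.
    a, b = a.ljust(width, "-"), b.ljust(width, "-")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Position 1 is the category letter, so the category counts as a position.
print(msd_diff("Pp1-pa--b-n", "Pp1-pa--n-n"))
```

For these two tags the only difference is the character that carries the Clitic value (b for bound, n for no).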
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CardinalNumeral
Bulgarian/Slovak: Cardinal numerals signify a numerical (quantitative) property of objects: jeden dom, dve ženy, tri knihy; един дом, две жени, три книги /one home, two women, three books/.
(Dimitrova et al. 2009)
Romanian: Traditional Romanian grammars usually distinguish seven numeral types, where five of them have specific forms and the other two are obtained by composition. The first group is made up by the following numeral types: cardinal (trei-three), ordinal (al treilea-the third), fractional (treime-one third), multiple (întreit-trine), collective (amândoi-both). The second group contains the numeral types which are composed by means of other parts of speech: distributive (câte trei-...each three...), adverbial (de trei ori-thrice) and again the collective numeral which also has compound forms (toţi trei-all three). Nonetheless, as the numerals of the second group have a weak syntactic cohesion, namely each composition element may be regarded as an element of the sentence, with its own grammatical function, these last numeral types are irrelevant for the morphosyntactic annotation. (MTE v4)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Case
feature of Noun (http://nl.ijs.si/ME/V4/msd/html/msd.N.html) and Verb (Russian and Estonian, http://nl.ijs.si/ME/V4/msd/html/msd.V.html)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CausalAdverb
Type="causal" (Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CausalisCase
Case="causalis" (Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Clitic
Clitic="yes" (Noun/Adjective: Romanian; Verb: Romanian, Polish, Serbian, Persian)
Slovak Pronoun: Type=reflexive encompasses all reflexive pronouns (sa, sebe, si, svoj, seba) as well as "sa" in its role as the obligatory particle of reflexive verbs. Personal and possessive reflexives are further distinguished via the Referent_Type attribute. "sa" in all its roles will be marked as the reflexive personal clitic pronoun.
The Clitic attribute distinguishes clitical vs. nonclitical pronominal forms, e.g. "ti" vs. "tebe".
Polish Pronoun: Prepositionality is encoded as Clitic with values "y(es)" for nią, niego etc., "n(o)" for ją, go etc., "a(gglutinant)" for -ń. Cf. the Clitic value "bound" for Slovene pronouns like zate which refers to the whole cluster, formally a combination of a preposition and a pronoun. This coding can be used for similar phenomena in Polish, e.g. dlań (for him), given the word segmentation is revised towards a more traditional one.
Hungarian Adverb: The modifier -e question word (the only Hungarian clitic) is attached to the preceding word with a hyphen.
Romanian Verb, Noun, Adjective: The cliticization phenomenon in Romanian is not restricted to the verb-pronoun relationship, but may also be observed with the (main) verb and the auxiliary, the noun or adjective with pronoun, with noun or adjective with copula, pronoun with auxiliary, preposition with (indefinite) article, numeral or (indefinite) pronoun, negative adverb with verb, auxiliary or pronoun, and some others (mainly created through the contracted forms of the verb "a fi"-to be). We restrict ourselves to considering only the graphically marked cliticizations. In such cases, the two, three or (sometimes) four constituents of a cliticized word-form are always separated by a hyphen. Omitting the hyphen in such cases is an unacceptable error in written Romanian.
Romanian Article: Note that the definite article has only enclitic forms, except for one proclitical form (lui + proper noun: lui Ion). The inflected forms of the foreign-origin words (mainly nouns) not fully assimilated, are usually written with a hyphen between the base-form and the inflectional ending. In our encoding, we classified these endings (which are supposed to be split by the segmenter) as clitic articles (clitic attribute is always "y") which can be either definite (type=f, "-istul") or indefinite (type=i, "ist") and are characterised by gender (gender=m, "ist"; gender=f, "istă"), number (number=s, "ist"; number=p, "işti") and case (case=r, "istul"; case=o, "istului").
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CliticDefiniteDeterminer
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CliticDeterminerType
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CliticDistalDeterminer
Definiteness="distal" (Noun/Adjective/Pronoun: Macedonian)
For Macedonian, the definiteness attributes can take the values: non definite (no), generally definite (yes), definite at short visible distance (proximal), and definite at longer visible distance (distal).
Bulgarian: For singular masculine, there are two forms: a full article(f)[l.s.] and a short article(s)[l.s.]. The full article is used when a singular masculine form is the syntactic subject of the clause, otherwise a short one is used – a purely orthographic rule. The distinction of full vs. short is not made for feminine, neuter and plural forms, and we use just the yes(y) or no(n) to mark definiteness or respectively lack thereof. Therefore, the definiteness attribute can take overall 4 different values: indefinite(n), definitive(y), short article(s), full article(f) жени, жените /women, the women/ (Dimitrova et al. 2009)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CliticElement
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CliticIndefiniteDeterminer
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CliticProximalDeterminer
Definiteness="proximal" (Noun/Adjective/Pronoun: Macedonian)
For Macedonian, the definiteness attributes can take the values: non definite (no), generally definite (yes), definite at short visible distance (proximal), and definite at longer visible distance (distal).
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CliticSpecificDeterminer
Persian does have an article, but it marks specificity rather than definiteness. The Persian article is similar to the Balkan one (a clitic of pronominal origin that's written together with the word), except that it isn't exactly definite (you can even see it described as an indefinite article). (Ivan A. Derzhanski, emails 2010/06/18)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CliticUnspecificDeterminer
Persian does have an article, but it marks specificity rather than definiteness. The Persian article is similar to the Balkan one (a clitic of pronominal origin that's written together with the word), except that it isn't exactly definite (you can even see it described as an indefinite article). (Ivan A. Derzhanski, emails 2010/06/18)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Cliticness
feature "Clitic"
It may be possible that the attribute Clitic means either "hasClitic" (if applied to Noun) or "isClitic" (if applied to Article): This is similar to Case, which on Adpositions means "requiresCase" rather than hasCase. Definitely something to think about; is it better to be formally correct or have a small set of attributes? (Tomaz Erjavec, email 2010/06/09)
[Romanian] Clitic feature denotes: 1) a character elision: i-am dat = îi am dat (î is deleted) - I gave him 2) insertion: ducându-mă = ducând+U+mă (U is inserted for phonological reasons) - carrying myself 3) or both: mâncându-l = mâncând+U+_e_l (U is inserted and e is deleted)
Polish Pronoun: Prepositionality is encoded as Clitic with values "y(es)" for nią, niego etc., "n(o)" for ją, go etc., "a(gglutinant)" for -ń. Cf. the Clitic value "bound" for Slovene pronouns like zate which refers to the whole cluster, formally a combination of a preposition and a pronoun. This coding can be used for similar phenomena in Polish, e.g. dlań (for him), given the word segmentation is revised towards a more traditional one.
(MTE, v4.0)
Czech Pronoun: The Clitic attribute distinguishes clitical vs. nonclitical pronominal forms, e.g. "ti" vs. "tobě". (MTE, v4.0)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Collective
Collective plurals are usually considered as derivation rather than an inflection, but are modelled as a number feature in the MTE schema of Resian (Slovene dialect in Italy).
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CollectiveNumber
Collective plurals, though usually considered as derivation rather than an inflection, are modelled as a number feature in the MTE schema of Resian (Slovene dialect in Italy).
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CollectiveNumeral
Numeral/Type="collect"
Romanian: Traditional Romanian grammars usually distinguish seven numeral types, where five of them have specific forms and the other two are obtained by composition. The first group is made up by the following numeral types: cardinal (trei-three), ordinal (al treilea-the third), fractional (treime-one third), multiple (întreit-trine), collective (amândoi-both). The second group contains the numeral types which are composed by means of other parts of speech: distributive (câte trei-...each three...), adverbial (de trei ori-thrice) and again the collective numeral which also has compound forms (toţi trei-all three). Nonetheless, as the numerals of the second group have a weak syntactic cohesion, namely each composition element may be regarded as an element of the sentence, with its own grammatical function, these last numeral types are irrelevant for the morphosyntactic annotation.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Collocation
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ComitativeCase
Case="comitative" (Estonian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CommonGender
http://nl.ijs.si/ME/V4/msd/html/msd.N.html (Nouns in Russian and Ukrainian)
Russian: Gender=common is used for words such as судья, коллега, чукча, саша, убийца, etc. Ukrainian: The Gender value "common" is assigned to nouns that can combine with adjectives in either feminine or masculine gender, e.g. сирота, or in either neuter or masculine gender, e.g. Самоа.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CommonNoun
Type=Common
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ComparativeDegree
e.g., působivějšími/působivý, rytířštějšími/rytířský, těžšími/těžký, vyššímu/vysoký, zajímavějšímu/zajímavý, závažnějšímu/závažný (cs)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ComparativeParticle
Type="comparative" (Bulgarian)
Bulgarian: Type=comparative(c) is for particles used to create comparatives or superlatives (по, най) – Slovak comparatives are formed through a morphology suffix, naj- is written together with superlatives. (this could be considered just a difference in orthography). (Dimitrova et al. 2009)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CompoundAdjective
Formation="compound" (Czech)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CompoundAdposition
Adposition/Formation="compound"
Resian: Compound prepositions, like ta-na / in, at are tagged as such.
Romanian: In Romanian there is a distinct class of compound prepositions. Each of them forms a formal and semantic unit, although graphically they stay unfused, e.g. de la, pe la, de pe, etc.
Slovak: A preposition can be contracted with a pronoun; such a preposition has Formation=c(ompound).
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CompoundConjunction
Conjunction/Formation="compound"
Resian: Compound conjunctions, like za wojo ki / because are tagged as such.
Romanian: As with prepositions, we can distinguish two kinds of conjunctions in Romanian: (1) simple conjunctions: e.g. şi, dar, deşi etc. (2) conjunctions formed periphrastically, with some word/phrase combined by a conjunction: din moment ce, fără să, faţă de cum etc.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CompoundInterjection
Interjection/Formation="compound"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CompoundParticle
Particle/Formation="compound"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Conditional
e.g., lennétek/lesz, továbbtaníttatnátok/továbbtaníttat, tudnálak/tud, tudnátok/tud, venné/vesz, veszélyeztetné/veszélyeztet, visszakapná/visszakap (hu)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Conjunction
e.g., aby, abych/aby, abychom/aby, abys/aby, abyste/aby, kdyby, kdybych/kdyby, kdybychom/kdyby, kdybys/kdyby (cs)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ConjunctionFormation
Conjunction/Formation
Formation refers to the graphical components: simple, i.e. consisting of one word; compound, i.e. consisting of more than one word.
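Because Formation is defined here purely graphically, the distinction can be sketched as a word count. The following Python fragment is an illustration only; the function name conjunction_formation is hypothetical, and splitting on whitespace is an assumption about what counts as "one word". The example forms come from the Resian and Romanian comments in this document.

```python
def conjunction_formation(form):
    """Classify a conjunction as 'simple' (one word) or 'compound' (more than one word)."""
    words = form.split()  # assumption: words are delimited by whitespace
    if not words:
        raise ValueError("empty form")
    return "compound" if len(words) > 1 else "simple"


for form in ["şi", "dar", "din moment ce", "za wojo ki"]:
    print(form, "->", conjunction_formation(form))
```

Note that this graphical criterion applies to Conjunction/Formation; for Czech and Slovak adpositions, Formation=compound marks a preposition contracted with a pronoun, which a word count would not detect.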
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CoordinatingConjunction
Type="coordinating"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CopulaVerb
Type="copula"
Czech: The verb "být" (E. "to be") in all its functions is characterized as Type=c (i.e. the copula), which clearly is an oversimplification because the verb has more meanings (auxiliary etc.). (MTE, v4.0)
Slovak: The verb "byť" (E. "to be") in all its functions is characterized as Type=c (i.e. the copula), which clearly is an oversimplification because the verb has more meanings (auxiliary etc.). (MTE, v4.0)
Resian: Verbs that can be of more than one Type are tagged according to their actual function, e.g. byt / to be can be main, copula or auxiliary. (MTE, v4.0)
Macedonian: We distinguish three types of verbs: main, auxiliary and modal. The word bi is considered as particle, rather than verb copula. (MTE, v4.0) "bi" is funny, as, in contrast to other copula it doesn't inflect for person / number (or, equivalently, it is fully syncretic), so maybe this was the reason Macedonians put it in particles. (Tomaz Erjavec, email 2010/06/09)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CorrelativeCoordinatingConjunction
Coord_Type="correlat"
In Romanian, there are three kinds of conjunctions depending on their usage: as such or together with other conjunctions or adverbs: (1) simple, between conjuncts: Ion ori Maria (John or Mary); (2) repetitive, before each conjunct: fie Ion fie Maria fie... (either John or Mary or...) (3) correlative, before a conjoined phrase, it requires specific coordinators between conjuncts: atât mama cât şi tata (both mother and father).
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#CountNumber
Number="count" (Nouns in Serbian, Macedonian, Bulgarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Courtesy
feature "Courtesy"
Resian: The attribute Courtesy is only relevant for the 2nd person plural, where forms in '-ta' refer to a plural subject and '-të' to a singular subject. For Slovene this attribute is not used, even though the distinction is made in a similar manner. (MTE v4)
Persian: In some cases for courtesy, instead of the singular form of the verb, the plural one is used to refer to a singular subject. So we consider it as an attribute for Farsi Verbs. In fact, such attributes for Farsi are not found in traditional grammar books. (Qasemizadeh and Rahimi 2006)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#DativeCase
e.g., amerykańskiej/amerykański, amerykańskim/amerykański, awansującym/awansować, całemu/cały, celowi/cel, choremu/chory, ci/ty, czemu/co, czemuś/coś (pl)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Definite
Definiteness="yes" (Noun/Adjective: Romanian, Macedonian, Bulgarian, Persian; Verb: Bulgarian, Hungarian; Pronoun: Resian, Macedonian, Bulgarian)
In Romanian, nouns can be marked for definiteness with the enclitic definite article. In a noun-adjective construction, the definite article may attach enclitically to either the adjective or the modified noun (never to both of them). If present, the definite article attaches to the right of the first word in the sequence, e.g. Bunul om (The kind man) vs. Omul bun (The kind man).
(MTE v4)
For Macedonian, the definiteness attributes can take the values: non definite (no), generally definite (yes), definite at short visible distance (proximal), and definite at longer visible distance (distal).
(MTE v4)
Persian: Persian does have an article, but it marks specificity rather than definiteness. (Ivan A. Derzhanski, email 2010/06/18) According to Qasemizadeh & Rahimi's (2006) description of tokenization, Definiteness of Nouns etc. thus refers to an orthographically non-separated definite (specificity-marking) article.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#DefiniteArticle
Type="definite"
Hungarian: We have three articles: a, az and egy. a and az are definite. These may not have number and case. The word 'az' may have them, but it is a pronoun in those cases. (MTE v4)
Resian: The definite article is 'te ta tö' and formally distinct from the demonstrative pronoun from which it derived: 'jte jta jtö'. (MTE v4)
The Persian article marks specificity rather than definiteness. (Ivan A. Derzhanski, email 2010/06/18)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Definiteness
corresponds to the definite and indefinite article in English, which is expressed in the Slavic languages by suffixes. For Bulgarian singular masculine there are two forms: full article and short article (full is used when a sing.masc. form is the syntactic subject of the clause, otherwise short article is used). The distinction full vs. short is not made for feminine, neuter and plural forms. Definiteness is also used in Romanian.
(MTE v4)
(Bulgarian) Definiteness attribute: One of the most important grammatical characteristics of the new Bulgarian language which sets it apart from the rest of the Slavic languages is the existence of a definite article. The definite article is a morphological indicator of the grammatical category determination (definiteness). The definite article is not a particle (particles are a separate category of words – parts-of-speech, while the article is not a separate word), nor is it a simple suffix, but a meaningful compound part of the word. It is a word-forming morpheme, which is placed at the end of words in order to express definiteness, familiarity, acquaintance (Bulgarian Grammar, 1993). In Bulgarian, nouns, adjectives, numerals, and full-forms of the possessive pronouns and participles can acquire an article.
(Dimitrova et al. 2009)
Bulgarian: For singular masculine, there are two forms: a full article(f)[l.s.] and a short article(s)[l.s.]. The full article is used when a singular masculine form is the syntactic subject of the clause, otherwise a short one is used – a purely orthographic rule. The distinction of full vs. short is not made for feminine, neuter and plural forms, and we use just the yes(y) or no(n) to mark definiteness or respectively lack thereof. Therefore, the definiteness attribute can take overall 4 different values: indefinite(n), definitive(y), short article(s), full article(f).
(Dimitrova et al. 2009)
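The four-valued Bulgarian Definiteness attribute described above can be sketched as a small decision function in Python. This is an illustration under stated assumptions: the names BG_DEFINITENESS and bg_definiteness_code are hypothetical, and the boolean inputs (definite, masculine singular, syntactic subject) are assumed to be known from elsewhere; only the value codes n, y, s, f and the full/short rule come from the text.

```python
# Value codes as given in the comment above (Dimitrova et al. 2009).
BG_DEFINITENESS = {
    "n": "indefinite",
    "y": "definite",
    "s": "short article",
    "f": "full article",
}


def bg_definiteness_code(definite, masculine_singular, is_subject=False):
    """Pick the Definiteness code for a Bulgarian form (hypothetical helper)."""
    if not definite:
        return "n"
    if masculine_singular:
        # Full article for the syntactic subject of the clause, short article otherwise.
        return "f" if is_subject else "s"
    # Feminine, neuter and plural forms do not distinguish full vs. short.
    return "y"


# A definite plural form such as жените 'the women' gets plain "y".
print(bg_definiteness_code(definite=True, masculine_singular=False))
```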
Polish:
The IPIC flexeme winien and predicatives like rad are treated as short adjectives—Definiteness="short-art".
The Vocalicity of (a)gglutinated forms like -em vs -m is mapped on the Definiteness attribute with its values "(f)ull-art" and "(s)hort-art" respectively, meaning "full form" and "short form". The terms are very artificial, but this category is used due to the similarity of the phenomenon.
(MTE v4)
Macedonian: The adverbs that come sometimes before nouns (mnogu, malku, nekolku) can have definiteness in their inflectional paradigm. That is why these adverbs are considered as adjectives as well.
(MTE v4)
Romanian: By virtue of their noun or adjective value, some numerals may take the enclitic article (prim/primul - first/the first). Consequently for the Romanian, definiteness attribute helps distinguish the enclitic forms from the other forms.
(MTE v4)
One of MTE v.3’s most perplexing choices is that it uses the same binary feature Definiteness of the part of speech Verb to indicate, in Bulgarian, that a participle bears a definite article (говорилите ‘the ones who talked’), and in Hungarian, that a finite form of a transitive verb has a definite 3rd person direct object (tanulom ‘I learn it’). Thus two totally dissimilar (not to mention unrelated) phenomena are handled alike merely because their names in the respective grammatical traditions happen to mean the same. In MTE v.4 the tagset for Persian encodes izafet as Case=genitive (i.e., practically the opposite!) in an effort to avoid introducing a language-specific feature.
(Derzhanski and Kotsyba 2009)
Hungarian Definiteness (of verbs): In simple terms, it means that the verb takes a definite object, which is reflected in the type of verb conjugation. Eg. in Hungarian there will be two forms of the verb 'see' here
I see you_sg/you_pl
(Csaba Oravecz, email 2010/06/15)
Persian: In Farsi, Nouns are inflected for number and Definiteness. ... Farsi adjectives are inflected for degree and definiteness. (Qasemizadeh and Rahimi 2006)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Degree
Hungarian: Some adverbs may have degree, but these do not form a special class. At present we cannot give a criterion for this.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#DelativeCase
Case="delative" (Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#DemandingClitic
Clitic="demanding" (Verb: Polish)
An element that contains a clitic which is, however, represented as a separate token.
Polish: Particles were extracted from the IPIC particle-adverbs category manually along with adverbs, pronouns and interjections and a few conjunctions. The Clitic attribute enables differentiating particles that are agglutinated to non-particles (value="a"), e.g., by, że. The value "y" labels a composite particle such as niechby when treated as one word; alternatively it may be encoded as a sequence of two particles, the optionally demanding niech with Clitic="d" and the agglutinant by with Clitic="a". (MTE v4)
This can be a subclass of ElementWithoutClitic. They are default though and won't be encoded in most cases. We only use them in some cases for Polish verbs. (Natalia Kotsyba, email 2010/06/21)
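The two alternative encodings of niechby described in the Polish note can be sketched as follows. This is only an illustration: the dictionary, the function name and the (form, code) tuple layout are assumptions made here and are not part of the MTE v4 specification; only the codes "a", "d" and "y" and the forms niech, by and niechby come from the note.

```python
# Illustrative sketch; names and data layout are assumptions, not part of MTE v4.

# Clitic codes as used in the Polish note above.
CLITIC_CODES = {
    "a": "agglutinant (particle attached to a non-particle, e.g. by)",
    "d": "demanding (element represented separately from its clitic, e.g. niech)",
    "y": "composite particle treated as one word (e.g. niechby)",
}


def encodings_of_niechby():
    """Return both encodings of niechby described in the note."""
    as_one_token = [("niechby", "y")]
    as_two_tokens = [("niech", "d"), ("by", "a")]
    return as_one_token, as_two_tokens


one, two = encodings_of_niechby()
# Both encodings spell out the same surface string.
assert "".join(form for form, _ in one) == "".join(form for form, _ in two)
```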
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#DemonstrativeArticle
Type="demonstrative"
Romanian: Although it presents only a few items, the article in Romanian has four types, unlike in most of the European languages. Beside the two recommended types: definite and indefinite which have the generally known semantic value, Romanian uses two additional types of articles, which are semantically subordinated to the definite article but which have special forms and meanings: (1) the possessive article (also called genitival article) is an element in the structure of the possessive pronoun, of the ordinal numeral (e.g. al meu (mine) and al treilea (the third)), and of the indefinite genitive forms of the nouns (e.g. capitol al cărţii (chapter of the book)). (2) the demonstrative article links a definite noun to its determinants, links a numeral or an adjective to a noun, and it is a constituent part of the relative superlative (e.g. fata cea mare (the elder girl), cel lenes, (the lazy), respectively prietenul cel mai bun (the best friend)).
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#DemonstrativeDeterminer
Determiner/Type="demonstrative" (English, Romanian, Persian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#DemonstrativePronoun
Pronoun/Type="demonstrative"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#DemonstrativeQuantifier
Numeral/Class="demonstrative" (Czech, Slovak)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#DeterminalPronoun
Pronoun/Type="determinal" (Estonian)
"in Estonian this label is used for the emphatic/reflexive pronouns ise, end(a) `(one)self'."
(Ivan A. Derzhanski, email 2010/06/15)
"The Estonian intensifier ise is formally identical with the Estonian reflexive pronoun and is not marked for person but for case and number."
Insa Gülzow (2006), The acquisition of intensifiers: Emphatic reflexives in English and German child language, Mouton de Gruyter, Berlin, p. 258
Heiki-Jaan Kaalep: "When I created the MTE tables, I used "Eesti keele grammatika II. Morfoloogia. Arv- ja asesõna" Tartu, TÜ, 1981 (Grammar of Estonian II. Morphology. Numerals and pronouns. Tartu, Univ. of Tartu, 1981). The types of pronouns in MTE tables originate from p. 35 of this booklet. However, grammar books of the current century give a different list of types for pronouns, and group the pronouns in somewhat different ways. There is no consensus about the issue at the moment, and no debate on it also, as far as I can tell. Estonian determinal pronouns are not similar to the English determiners. However, all the determinal pronouns could be classified into other types (if we wanted to...), so we could get rid of this category altogether." (Heiki-Jaan Kaalep, email 2010/06/21)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Determiner
e.g., 'Er/her, 'is/his, a, an/a, her, his, its, my, our (en)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#DigitNumeral
Form="digit"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Diminuitive
Degree="diminuitive" (Adjective: Resian)
Resian: The value 'diminutive' for Degree is relevant for derived adjectives that end with the suffix '-ić'.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#DirectCase
Case="direct" (Romanian)
In the Romanian case system the value 'direct' conflates 'nominative' and 'accusative'.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#DistributiveCase
Case="distributive" (Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#DualNumber
Czech: The dual Number manifests itself only in the instrumental Case of several Nouns denoting dual parts of the human body.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#DualQuantifier
Numeral/Class="definite2" (Czech)
cf. Slovak: Among the definite numbers there are four subclasses (definite1, definite2, definite34, definite) which differ in their syntactic distribution and contain the following numerals: {1}, {2,3,4}, {5,6,...}
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ElativeCase
Case="elative" (Estonian, Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ElativeDegree
Degree="elative" (Adjective: Resian, Serbian, Macedonian)
In Semitic languages, the “adjective of superiority.” In some languages such as Arabic, the concepts of comparative and superlative degree of an adjective are merged into a single form, the elative. How this form is understood or translated depends upon context and definiteness. In the absence of comparison, the elative conveys the notion of “greatest”, “supreme.”
The elative of كبير (kabí:r, "big") is أكبر (’ákbar, “bigger/biggest”, “greater/greatest”).
(http://en.wiktionary.org/wiki/elative)
In Slavic languages, as well, it is pretty standard. I do agree with the definition though, that "the elative conveys the notion of “greatest”, “supreme.”" So, "lep" is beautiful "prelep" is very (or supremely) beautiful; I guess the "pre-" prefix could be roughly translated as "over-". Used in Resian, Serbian, Macedonian; formerly in Slovenian too, but we banished it, as even "ordinary" degrees are borderline inflection / derivation, but, I think, elative is definitely not inflection. (Tomaž Erjavec, email 2010/06/21)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ElementWithClitic
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ElementWithoutClitic
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#EmphaticDeterminer
Determiner/Type="emphatic" (Romanian)
In Romanian, there are specific forms for the so-called emphatic determiner, which may accompany both a noun and a personal pronoun: fata însăşi (the girl herself), also ea însăşi (she herself).
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#EmphaticPronoun
Pronoun/Type="emphatic" (Ukrainian)
Ukrainian: The emp(h)atic Type of Pronoun is used for pronoun forms ні"кому, ні"чому, ні"чим, ні"кого, etc., with complex meanings like "there is nobody/nothing (to do sth/to use for doing sth, etc.)". Orthographically these are identical forms of negative nominal pronouns ніхто, ніщо "nobody, nothing" in oblique cases, however, with differing accent. They are referred to as either separate pronoun lexemes or predicatives in grammars. All Ukrainian emphatic pronoun forms include negation.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#EssiveCase
Case="essive" (Hungarian, Estonian)
In Estonian the essive case means such things as `(I played golf) as a student',
`(I worked) as a bartender', `(you look) tired',
`(he's very good) as a dancing partner', `(we parted) as friends'.
This doesn't sound like the definition you quoted, but is similar
(though not identical) to the meaning of the Hungarian form.
(Ivan A. Derzhanski, email 2010/06/15)
Hungarian has two essive cases, essive-formalis (formativus, e.g., emberként "as people") and essive-modalis (essivus-modalis, e.g., emberül from ember "people")
(Nose 2003, p. 108)
The essive-modal case in Hungarian language can express the state, capacity, task in which somebody is or which somebody has (Essive case, e.g. "as a reward", "for example"), or the manner in which an action is carried out, an event happens, or the language which somebody knows (Modal case, e.g. "sloppily", "unexpectedly", "speak German"). An example of this would be in the sentence "Beszélek magyarul." (I speak Hungarian.) The sentence denotes the ability of being able to speak the Hungarian language. According to vowel harmony rules, ul becomes ül in cases such as "Beszélek németül." (I speak German.) because the word for "German", német is composed completely of median and/or frontal vowels. (http://en.wikipedia.org/wiki/Essive-modal_case)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#EssiveFormalCase
Case="essive-formal" (Hungarian)
e.g., Hungarian 'katonaként' -> [serves] as a soldier. (Csaba Oravecz, email 2010/06/15)
The Hungarian "formativus, or essivus-formalis `-ként' ... usually expresses a position, task and manner of the person or the thing." (Nose 2003)
"Haspelmath & Buchholz (1998:321) explained the function of the essive case as ``role phrases''. Role phrases represent the role of the function in which a participant appears. They regard the role phrases as adverbial."
(Nose 2003, p. 117)
In the Hungarian language this case combines the Essive case and the Formal case, and it can express the position, task, state (e.g. "as a tourist"), or the manner (e.g. "like a hunted animal"). The status of the suffix -ként in the declension system is disputed for several reasons. First, in general, Hungarian case suffixes are absolute word-final, while -ként permits further suffixation by the locative suffix -i. Second, most Hungarian case endings participate in vowel harmony, while -ként does not. For these reasons, many modern analyses of the Hungarian case system, starting with László Antal's "A magyar esetrendszer" (1961) do not consider the essive/formal to be a case.
(http://en.wikipedia.org/wiki/Essive-formal_case)
cf. Masahiko Nose (2003), Adverbial Usage of the Hungarian Essive Case
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ExclamativeDeterminer
e.g., چهقدر چه (fa)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ExclamativePronoun
Pronoun/Type="exclamative"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ExistentialThere
Pronoun/Type="ex-there" (English)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#FactiveCase
Case="factive" (Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#FeminineGender
e.g., akceptowanymi/akceptować, akredytujące/akredytować, akredytujących/akredytować, akredytującą/akredytować, cytowanych/cytować, cztery/cztery, czwartej/czwarty, czwartą/czwarty, debiutująca/debiutować (pl)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#FirstPerson
e.g., آمديم/آمد, باشم/باش, باشيم/باش, بتوانم/توان, بتوانيم/توان, بخواهيم/خواه, بدانم/دان بخوابونم/خوابان ببينم/بين, بدهيم/ده بخوانيم/خوان ببينيم/بين, بريم/رو برويم/رو باشيم/باش (fa)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#FirstSgSecondSg
Definiteness="1s2s" (Verb: Hungarian)
Hungarian: 1s2s is a special form for definiteness, in which the speaker's person is first singular (I) and the target of the transitivity is second singular (you). (MTE v4)
Hungarian Definiteness (of verbs): In simple terms, it means that the verb takes a definite object, which is reflected in the type of verb conjugation. Eg. in Hungarian there will be two forms of the verb 'see' here
I see you_sg/you_pl
(Csaba Oravecz, email 2010/06/15)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Foreign
Residual/Type="foreign"
For Slovene the Type attribute has been introduced on Residual, which distinguishes the values "foreign", to mark a word in a stretch of foreign-language text, "typo", a mis-typed word, and "program", where the tokenisation program made a mistake. The second and especially the third value are useful for hand-annotation of corpora.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#FormalCase
(from the discussion of [Hungarian] EssiveFormalCase)
"`formal' in `essive-formal' is not an indication of register: there is another form, which in some descriptions is simply called `formal', with the affix _-képp(en)_ and a similar meaning (`in the form of ...', they probably meant when they came up with the term).
The line between a case ending, an adverb formative and a postposition is a thin one in Hungarian."
(Ivan A. Derzhanski, email 2010/06/15)
http://en.wikipedia.org/wiki/Essive-formal_case (2010/06/15): "In the Hungarian language this case combines the Essive case and the Formal case, and it can express the position, task, state (e.g. "as a tourist"), or the manner (e.g. "like a hunted animal")."
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Formation
Formation: refers to the graphical components: simple, i.e. consisting of one word; compound, i.e. consisting of more than one word.
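The simple/compound distinction above depends only on the number of graphical words, so it can be sketched in a few lines. The function name and the choice of whitespace as the word separator are assumptions made for illustration; the specification itself does not prescribe a tokenisation.

```python
# Illustrative sketch; whitespace splitting is an assumption, not an MTE rule.

def formation(expression: str) -> str:
    """Return "simple" for one graphical word, "compound" for more than one."""
    words = expression.split()
    if not words:
        raise ValueError("expression must contain at least one word")
    return "simple" if len(words) == 1 else "compound"
```

For instance, formation("ach") yields "simple", while any expression written as two or more whitespace-separated words yields "compound".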
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#FractalNumeral
Romanian: Traditional Romanian grammars usually distinguish seven numeral types, where five of them have specific forms and the other two are obtained by composition. The first group is made up by the following numeral types: cardinal (trei-three), ordinal (al treilea-the third), fractional (treime-one third), multiple (întreit-trine), collective (amândoi-both). The second group contains the numeral types which are composed by means of other parts of speech: distributive (câte trei-...each three...), adverbial (de trei ori-thrice) and again the collective numeral which also has compound forms (toţi trei-all three). Nonetheless, as the numerals of the second group have a weak syntactic cohesion, namely each composition element may be regarded as an element of the sentence, with its own grammatical function, these last numeral types are irrelevant for the morphosyntactic annotation.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#FullArticle
Definiteness="full-art" (Noun: Bulgarian; Verb: Polish, Russian, Bulgarian; Adjective: Polish, Russian, Ukrainian, Bulgarian; Pronoun: Polish, Bulgarian)
Bulgarian:
For singular masculine, there are two forms: a full article(f)[l.s.] and a short article(s)[l.s.]. The full
article is used when a singular masculine form is the syntactic subject of the clause, otherwise a short one is
used – a purely orthographic rule. The distinction of full vs. short is not made for feminine, neuter and plural
forms, and we use just the yes(y) or no(n) to mark definiteness or respectively lack thereof. Therefore, the
definiteness attribute can take overall 4 different values: indefinite(n), definitive(y), short article(s), full
article(f)
e.g., мъж, мъжа, мъжът /a man, the man[short], the man [full]/
(Dimitrova et al. 2009)
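The four-way Bulgarian distinction quoted above can be read as a small decision procedure. The sketch below is an illustration only: the function name and its boolean parameters are assumptions, and the subject test stands in for whatever syntactic information an annotator or tagger actually has available.

```python
# Illustrative sketch; function and parameter names are assumptions.

# One-letter codes and the attribute values they abbreviate.
DEFINITENESS = {"n": "no", "y": "yes", "s": "short-art", "f": "full-art"}


def definiteness_code(definite: bool, masculine_singular: bool, is_subject: bool) -> str:
    """Pick the Definiteness code following the rule quoted from Dimitrova et al."""
    if not definite:
        return "n"
    if not masculine_singular:
        # Feminine, neuter and plural forms only distinguish yes/no.
        return "y"
    # Masculine singular: full article for the syntactic subject, short otherwise.
    return "f" if is_subject else "s"
```

Under these assumptions the three forms in the example receive "n" (мъж), "s" (мъжа) and "f" (мъжът).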
Polish: The IPIC flexeme winien and predicatives like rad are treated as short adjectives—Definiteness="short-art". The Vocalicity of (a)gglutinated forms like -em vs -m is mapped on the Definiteness attribute with its values "(f)ull-art" and "(s)hort-art" respectively, meaning "full form" and "short form". The terms are very artificial, but this category is used due to the similarity of the phenomenon.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#FutureParticle
Particle/Type="future" (Romanian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#FutureTense
Tense="future"
Czech/Slovak: Normally, Verbs form the future Tense periphrastically by auxiliary "být" (E. "to be") plus infinitive of the main Verb. In addition to the copula, there are, however, some Verbs which form future Tense non-periphrastically, i.e. synthetically (Verbs of motion). Such verbal forms are marked as Tense=f.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Gender
Ukrainian:
Ukrainian pluralia tantum nouns are not encoded directly but can be identified by the absence of a value of Gender ("-").
Polish:
Macedonian: The attributes Owner_Number and Owner_Gender are used for the pronoun negov. For other pronouns, ? is used.
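The Ukrainian convention above (pluralia tantum are recognisable by Gender="-") can be checked mechanically once a tag has been decoded into attribute/value pairs. The sketch assumes such a decoded dictionary with the hypothetical keys "CATEGORY" and "Gender"; how the dictionary is obtained from an MSD string is deliberately left out.

```python
# Illustrative sketch; the dictionary layout and key names are assumptions.

def is_plurale_tantum(features: dict) -> bool:
    """True for a Ukrainian noun whose Gender value is the placeholder "-"."""
    return features.get("CATEGORY") == "Noun" and features.get("Gender") == "-"


# Hypothetical decoded analyses, for illustration only.
with_gender = {"CATEGORY": "Noun", "Gender": "masculine", "Number": "singular"}
without_gender = {"CATEGORY": "Noun", "Gender": "-", "Number": "plural"}
```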
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#GeneralAdjective
Type="general" (Slovene)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#GeneralAdverb
Type="general"
Romanian: The distinction proposed here considers the principal syntactic properties of the adverbs. For Romanian, the general type includes most of the pronominal adverbs (demonstrative: aici (here), indefinite: oriunde (anywhere)). As argued before for pronouns and determiners, a distinct negative value is needed for adverbs as well (nicăieri - nowhere, niciodată - never). The particle type covers those adverbs which can dislocate verbal compound forms (ex. Ea a tot cântat -- She has ever sung) or mark degrees (ex. circa (about), foarte (very), prea (too)). Such adverbs are cam, mai, prea, şi, tot, foarte etc. A useful distinction in Romanian considers the adverbs which can have predicative role, that is they can govern a subordinate sentence (ex. Fireşte că o ştiu -- Certainly I know it). Here (for uniformity within a multilingual environment), they are squeezed into the modifier class. No formal distinction is made between the interrogative adverbs and the relative ones. The "portmanteau" type of adverb was introduced to cover some few words which can be both adverbs and conjunctions (with adverbial reading more frequent). This was necessary for tagging purposes.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#GeneralDeterminer
Determiner/Type="general" (English)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#GeneralParticle
Type="general" (Bulgarian)
Bulgarian: In the Bulgarian MTE tagset, particles are characterised by the Type attribute. Type attribute is one of negative, general, comparative, verbal, interrogative, modal. ... Type=general(g) is for all the other, non-specialised particles. (Dimitrova et al. 2009)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#GeneralPronoun
Pronoun/Type="general" (English, Slavic)
Resian: The taxonomy for Type is in concordance with the grammatical analysis of Resian, be it that 'other' here appears as the value 'general'.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#GenitiveCase
e.g., 0, Аахена/Аахен Абая/Абай Абакана/Абакан Абакану/Абакан Абіджана/Абіджан Абовяна/Абовян Авентина/Авентин Адамстауна/Адамстаун Адена/Аден Аджаристану/Аджаристан Адлера/Адлер Адоніса/Адоніс Адріанополя/Адріанополь Азенкура/Азенкур Азербайджану/Азербайджан Азова/Азов Акермана/Акерман Акмолінська/Акмолінськ Акрополя/Акрополь Актюбінська/Актюбінськ, Або/= Абруццо/= Абу-Дабі/= Адріатичного моря/Адріатичне море Азовського моря/Азовське море Айдахо/= Ай-Петрі/= Аконкагуа/= Актау/= Алеппо/= Аліканте/= Алмати/= Альбі/= Андаманського моря/Андаманське море Анжу/= Ані/= Антананаріву/= Антигуа/= Апіа/= Апостолового/Апостолове, Абіссінії/Абіссінія Абуджі/Абуджа Абхазії/Абхазія Авдіївки/Авдіївка Австралії/Австралія Австрії/Австрія Австро-Угорщини/Австро-Угорщина Аддис-Абеби/Аддис-Абеба Аджарії/Аджарія Адигеї/Адигея Адріатики/Адріатика Азії/Азія Айови/Айова Аквітанії/Аквітанія Аккри/Аккра Ак-Мечеті/Ак-Мечеть Акмоли/Акмола акрилової кислоти/акрилова кислота Алабами/Алабама Албанії/Албанія, Азорських островів/Азорські острови Аландських островів/Аландські острови Алеутських островів/Алеутські острови Андаманських островів/Андаманські острови Антильських островів/Антильські острови Багамських островів/Багамські острови Балеарських островів/Балеарські острови Бельбеків/Бельбек Бермудських островів/Бермудські острови Бугів/Буг Великоднів/Великдень Великих Сорочинців/Великі Сорочинці Віргінських островів/Віргінські острови Гавайських островів/Гавайські острови Гебридських островів/Гебридські острови Драконових гір/Драконові гори ЖЕКів/ЖЕК Зеленого Мису островів/Зеленого Мису острови Змієвих валів/Змієві вали Зондських островів/Зондські острови, Альп/Альпи Анд/Анди Апеннін/Апенніни Аппалачів/Аппалачі Арденн/Арденни Афін/Афіни Багам/Багами Балкан/Балкани Байдарських воріт/Байдарські ворота Барановичів/Барановичі Бендер/Бендери Березняків/Березняки Бермуд/Бермуди Бескидів/Бескиди Біличів/Біличі Близнюків/Близнюки Богородчан/Богородчани Боровичів/Боровичі Бортничів/Бортничі 
Броварів/Бровари, Америк/Америка Андалуських гір/Андалуські гори Атлаських гір/Атлаські гори Бистриць/Бистриця Держдум/Держдума Жовтих Вод/Жовті Води Збройних Сил України/Збройні сили України Індій/Індія Інтернет-газет/Інтернет-газета Караваєвих дач/Караваєві дачі Магелланових хмар/Магелланові хмари Македоній/Македонія Мате Залка/= Пасх/Пасха Пилипівок/Пилипівка Рад/Рада Родезій/Родезія Інтернет-технологій/Інтернет-технологія, Антропова/Антропов Абая/Абай Абакума/Абакум Абакумовича/Абакумович Абалкіна/Абалкін Абашидзе/= Аббаса/Аббас Абдули/Абдула Абдуллоджанова/Абдуллоджанов Абеля/Абель Абовяна/Абовян Абрама/Абрам Абрамовича/Абрамович Абрамовича/Абрамович Абрамчука/Абрамчук Абуладзе/= Авакума/Авакум Авакумовича/Авакумович Августа/Август Августимова/Августимов, Антропових/Антропов Абакумів/Абакум Абакумовичів/Абакумович Абалкіних/Абалкін Абашидзе/= Аббасів/Аббас Абдулів/Абдула Абдуллоджанових/Абдуллоджанов Абелів/Абель Абовянів/Абовян Абрамів/Абрам Абрамовичів/Абрамович Абрамовичів/Абрамович Абрамчуків/Абрамчук Абуладзе/= Авакумів/Авакум Авакумовичів/Авакумович Августів/Август Августимових/Августимов Августинів/Августин (uk)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Gerund
VForm="Gerund". For Polish Nouns also Type="Gerund", hence "VerbForm and (Noun or VerbForm)" (http://nl.ijs.si/ME/V4/msd/html/msd.N.html).
Russian: Verb/Type="gerund"
The Russian specification distinguishes Type=Gerund, and hence introduces various nominal properties on Verb, i.e. Definiteness, Case.
Polish Noun/Type="gerund":
cf. Czech: Verbal nouns are classified as Nouns.
Macedonian: Verb forms gerund and adverbial participle are taken to be separate (noun and adverb) lemmas.
Romanian: The following features are pertinent to those moods which permit an adjectival use, i.e. participle and gerund. However, the adjectival use of gerund is extremely rare (o mână tremurândă - a shaking hand) and therefore gender and number apply mainly for the participle.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#GerundOrAdverbialParticiple
The problem is that the English term gerund is ambiguous: with respect
to Latin, in whose grammatical tradition it originates, it refers to a
deverbal noun, and is needed in this function for Polish as well; in
descriptions of some other languages, however, it has been used for an
adverbial participle. The two meanings have nothing in common, except
that the English ing-form can translate both.
(Ivan A Derzhanski, email 2010/06/09)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#GerundProper
Currently gerunds (deverbal nouns) are encoded as common nouns. Since they are very frequent in Polish,
it seems expedient to add a type for them, with the additional features Aspect and Negation relevant only
to gerunds. The latter would enable celebrowanie ‘celebrating’ and niecelebrowanie ‘not celebrating’ to
count as forms of the same lexeme
(Derzhanski and Kotsyba 2009)
The problem is that the English term gerund is ambiguous: with respect
to Latin, in whose grammatical tradition it originates, it refers to a
deverbal noun, and is needed in this function for Polish as well; in
descriptions of some other languages, however, it has been used for an
adverbial participle. The two meanings have nothing in common, except
that the English ing-form can translate both.
(Ivan A Derzhanski, email 2010/06/09)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Human
Human="yes"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Humanness
feature "Human"
Polish:
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#IllativeCase
Case="illative" (Estonian, Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Imperative
e.g., buď/být, chtěj/chtít, dejme/dát, dejte/dát, nebraňme/bránit, nebudiž/být, nebuďme/budit, nebuďme/být, nevezme/vézt (cs)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ImperfectTense
Tense="imperfect" (Romanian, Croatian, Serbian, Macedonian, Bulgarian, Estonian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Impersonal
VForm="impersonal" (Polish, Ukrainian)
Ukrainian: The impersonal VForm (o) is characterized by the ending -то/-но. It exists in other Slavic languages as well, although in most of them it coincides with the neuter form of the passive adjectival participle and is classified as such. In Ukrainian, as well as in Polish, the attributive form is different from the predicative one, cf. in Ukrainian писане правило (a written rule) vs писано правило (a rule was/is written).
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Inanimate
Animate="no" (Slavic Noun/Pronoun http://nl.ijs.si/ME/V4/msd/html/msd.N.html; Czech verb)
Ukrainian: The feature "Animate" is used to differentiate between two accusative masculine forms.
Slovak distinguishes masculine animate (Animate=yes above) and masculine inanimate (Animate=no) Gender. Masculine inanimate nouns always have the same form in the nominative and accusative case, whereas masculine animate nouns have predominantly the same form in the genitive and accusative case. Masculine animate nouns and masculine inanimate nouns differ in accusative singular, nominative (vocative) and accusative plural only.
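The syncretism pattern described for Slovak can be turned into a rough heuristic over singular forms. This is a sketch under stated assumptions, not a rule of the specification: the function name is hypothetical, the check only reflects the tendencies named above ("always" for inanimates, "predominantly" for animates), and the sample forms chlap and dub are common textbook paradigm nouns supplied here for illustration, not taken from the MTE lexicon.

```python
# Illustrative heuristic; not part of the MTE v4 specification.
from typing import Optional


def animacy_from_singular(nom: str, gen: str, acc: str) -> Optional[str]:
    """Guess the Animate value of a masculine noun from three singular forms.

    Returns "no" if accusative equals nominative, "yes" if accusative equals
    genitive, and None if neither pattern applies. A noun with identical
    nominative, genitive and accusative is reported as "no".
    """
    if acc == nom:
        return "no"
    if acc == gen:
        return "yes"
    return None
```

With the illustrative forms, animacy_from_singular("chlap", "chlapa", "chlapa") gives "yes" and animacy_from_singular("dub", "duba", "dub") gives "no".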
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Indefinite
Definiteness="no" (Noun/Adjective: Romanian, Macedonian, Bulgarian, Persian; Verb: Bulgarian, Hungarian; Pronoun: Resian, Macedonian, Bulgarian)
For Macedonian, the definiteness attributes can take the values: non definite (no), generally definite (yes), definite at short visible distance (proximal), and definite at longer visible distance (distal). (MTE v4)
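Because the Definiteness value sets differ by language, a per-language lookup is one way to validate annotations. The sketch below covers only the Macedonian values listed above and the Bulgarian values named in the Dimitrova et al. quotation; the language codes "mk" and "bg", the table name and the function name are assumptions made here.

```python
# Illustrative sketch; table layout and language codes are assumptions.

ALLOWED_DEFINITENESS = {
    "mk": {"no", "yes", "proximal", "distal"},
    "bg": {"no", "yes", "short-art", "full-art"},
}


def is_valid_definiteness(lang: str, value: str) -> bool:
    """Check a Definiteness value against the set recorded for the language."""
    return value in ALLOWED_DEFINITENESS.get(lang, set())
```

A language missing from the table is treated as having no valid values, which makes gaps in the table visible instead of silently accepting any value.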
Bulgarian:
For singular masculine, there are two forms: a full article(f)[l.s.] and a short article(s)[l.s.]. The full
article is used when a singular masculine form is the syntactic subject of the clause, otherwise a short one is
used – a purely orthographic rule. The distinction of full vs. short is not made for feminine, neuter and plural
forms, and we use just the yes(y) or no(n) to mark definiteness or respectively lack thereof. Therefore, the
definiteness attribute can take overall 4 different values: indefinite(n), definitive(y), short article(s), full
article(f)
e.g., мъж, мъжа, мъжът /a man, the man[short], the man [full]/
(Dimitrova et al. 2009)
Persian: Persian does have an article, but it marks specificity rather than definiteness. (Ivan A. Derzhanski, email 2010/06/18) According to Qasemizadeh & Rahimi's (2006) description of tokenization, Definiteness of Nouns etc. thus refers to an orthographically non-separated definite (specificity-marking) article.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#IndefiniteAdjective
Type="indefinite"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#IndefiniteArticle
Type="indefinite"
Hungarian: We have three articles: a, az and egy. egy is indefinite. These may not have number and case.
Resian: The indefinite pronoun is 'din na nö' and formally distinct from the numeral from which it derived: 'dyn dnä dnö'. (under Article)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#IndefiniteDeterminer
Determiner/Type="indefinite" (English, Romanian, Persian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#IndefinitePronoun
Pronoun/Type="indefinite"
Romanian: In Romanian it is worth differentiating the negative pronoun from other indefinite pronouns: a negative pronoun cannot be an argument for a verb unless the verb itself is negated too (e.g. Nu am văzut pe nimeni / *Am văzut pe nimeni).
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#IndefiniteQuantifier
Numeral/Class="indefinite" (Czech, Slovak)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Indicative
e.g., -, elárultalak/elárul, figyelmeztettelek/figyelmeztet, megismertelek/megismer, uralhatjátok/uralhat, visszaemlékszel/visszaemlékszik, visszarettentek/visszaretten, visszatérsz/visszatér, visszatértek/visszatér (hu)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#InessiveCase
Case="inessive" (Hungarian, Estonian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Infinitive
e.g., bejt/být, být, bývat, dostat, mít, nebýt/být, nemít/mít, nezneužívat/zneužívat, neztratit/ztratit (cs)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#InfinitiveParticle
Particle/Type="infinitive" (Romanian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#InitialCoordinatingConjunction
Coord_Type="initial" (English)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#InstrumentalCase
e.g., desaterým/desaterý, devatenáctým/devatenáctý, druhou/druhý, druhým/druhý, druhými/druhý, dvojí, dvěma/dva, jakou/jaký, jakým/jaký (cs)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Interjection
e.g., ach/ach, ho/ho, och/och (pl)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#InterjectionFormation
Formation: refers to the graphical components: simple, i.e. consisting of one word; compound, i.e. consisting of more than one word.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#InterrogativeAdverb
Type="interrogative"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#InterrogativeDeterminer
Determiner/Type="interrogative" (Persian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#InterrogativeOrRelativeAdverb
Type="int-rel" (Romanian)
Romanian: The distinction proposed here considers the principal syntactic properties of the adverbs. For Romanian, the general type includes most of the pronominal adverbs (demonstrative: aici (here), indefinite: oriunde (anywhere)). As argued before for pronouns and determiners, a distinct negative value is needed for adverbs as well (nicăieri - nowhere, niciodată - never). The particle type covers those adverbs which can dislocate verbal compound forms (ex. Ea a tot cântat -- She has ever sung) or mark degrees (ex. circa (about), foarte (very), prea (too)). Such adverbs are cam, mai, prea, şi, tot, foarte etc. A useful distinction in Romanian considers the adverbs which can have predicative role, that is they can govern a subordinate sentence (ex. Fireşte că o ştiu -- Certainly I know it). Here (for uniformity within a multilingual environment), they are squeezed into the modifier class. No formal distinction is made between the interrogative adverbs and the relative ones. The "portmanteau" type of adverb was introduced to cover some few words which can be both adverbs and conjunctions (with adverbial reading more frequent). This was necessary for tagging purposes.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#InterrogativeOrRelativeDeterminer
Determiner/Type="int-rel" (Romanian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#InterrogativeOrRelativePronoun
Pronoun/Type="int-rel" (Romanian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#InterrogativeParticle
Type="interrogative"
Bulgarian: Type=interrogative(i) are particles used to form yes/no-questions or exclamations (ли, дали, нали, нима, мигар) – this type of particles is not present in Slovak at all. (Dimitrova et al. 2009)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#InterrogativePronoun
Pronoun/Type="interrogative"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#InterrogativeQuantifier
Numeral/Class="interrogative" (Czech, Slovak)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Intransitive
Transitive="no" (Persian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#LetterNumeral
Form="letter"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#LightVerb
Type="light" (Persian)
Persian has a limited number of just over a hundred simple
verbs. The vast majority of verbal notions are expressed through
compound verbs using one of a limited set of light verbs (which also
occur as simple verbs).
(Neiloufar Famili, abstract of a talk given 2010/06/14)
Light verbs in Farsi are used to make a compound verb structure. Compound verb structure consists of one or more preverbal elements which could be a noun, adjective, or a prepositional phrase, followed by a Light verb. The number of Light verbs is limited. The elements of a compound verb construction can be separated by other lexical elements such as the object of the verbal construction or an adjective, adverb, etc. Therefore our suggestion is to analyze compound verb construction only at the syntactic level. We should also note that Light verbs are homographic with Main verbs. (Qasemizadeh and Rahimi 2006)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#LocativeCase
e.g., belim/beo, Bratstvu/Bratstvo, dvojim/dvoje, Hajd, jednoj/jedan, jednom/jedan, jednome/jedan, jednomu/jedan, Jevrejkama/Jevrejka (sr)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#MFormNumeral
Form="m-form" (Bulgarian)
Bulgarian has an additional Form=m_form(m), used only for people, formed with suffix -(и)ма: двама, трима, петима /two(people), three(people), five(people)/ and Form=approx(a), used for approximate numerals (десетина /about a ten/, стотина /about a hundred/)
(Dimitrova et al. 2009)
[S]pecial form of cardinal numbers for persons of masculine gender for `two', `three', `four', `five' and `six'
(Lily Earl [2000], A comprehensive Bulgarian grammar for foreign learners, Daniela Ubenova, Sophia, p. 153)
They go beyond six, though the higher the number, the less natural they sound. `Seven Brides for Seven Brothers' is Sedem nevesti za sedmina bratja, always. Otoh, The Seven Samurai is Sedemte samurai, not Sedminata samurai. It's a stylistic choice. (Ivan A Derzhanski, email 2010/06/20)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#MainVerb
Type="main"
Macedonian: We distinguish three types of verbs: main, auxiliary and modal. The word bi is considered as particle, rather than verb copula.
Resian: Verbs that can be of more than one Type are tagged according to their actual function, eg. byt / to be can be main, copula or auxiliary.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#MasculineGender
e.g., henten, hentie/henten, hentoho/henten, hentí/henten, hentých/henten, najžulovitejšie/žulovitý, najžulovitejšieho/žulovitý, najžulovitejšiemu/žulovitý, najžulovitejšom/žulovitý (sk)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#MedialVoice
Voice="medial" (Russian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ModalParticle
Type="modal"
Bulgarian: Type=modal(o) – used to express urge or order, mostly homonymous with other types of particles, for instance да, дано, нека, хайде. (Dimitrova et al. 2009)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ModalVerb
Type="modal"
Macedonian: We distinguish three types of verbs: main, auxiliary and modal. The word bi is considered as particle, rather than verb copula. (MTE v4)
Persian: "Modal type of verbs is used to change the aspect of verbs to Subjunctive. Usually they come before Main verbs in present subjunctive form so the Main verb will have normal inflectional attributes. But if the Main verb appears in past 3rd person form, then the construction will be impersonal. Modal verbs usually are not inflected by number and person. However, there is an exception for the verb 'توانستن' (tavânestan) that can be inflected for person and number." (Qasemizadeh and Rahimi 2006)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ModificationType
Determiner/Modific_Type Modific_Type: refers to the prenominal or postnominal positions of Determiners which distinguish different forms in Romanian.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ModifierAdverb
Type="modifier" (English, Romanian, Hungarian)
Romanian: The distinction proposed here considers the principal syntactic properties of the adverbs. For Romanian, the general type includes most of the pronominal adverbs (demonstrative: aici (here), indefinite: oriunde (anywhere)). As argued before for pronouns and determiners, a distinct negative value is needed for adverbs as well (nicăieri - nowhere, niciodată - never). The particle type covers those adverbs which can dislocate verbal compound forms (ex. Ea a tot cântat -- She has ever sung) or mark degrees (ex. circa (about), foarte (very), prea (too)). Such adverbs are cam, mai, prea, şi, tot, foarte etc. A useful distinction in Romanian considers the adverbs which can have predicative role, that is they can govern a subordinate sentence (ex. Fireşte că o ştiu -- Certainly I know it). Here (for uniformity within a multilingual environment), they are squeezed into the modifier class. No formal distinction is made between the interrogative adverbs and the relative ones. The "portmanteau" type of adverb was introduced to cover some few words which can be both adverbs and conjunctions (with adverbial reading more frequent). This was necessary for tagging purposes.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#MoodInterjection
Type="mood" (Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#MorphologicalDerivation
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#MorphologicalFormOfNumeral
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#MorphosyntacticCategory
Top-level categories as specified under http://nl.ijs.si/ME/V4/msd/html/msd.cats.html. Subordinate categories reflect "Type" and related attributes.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#MorphosyntacticFeature
Morphosyntactic features as specified under http://nl.ijs.si/ME/V4/msd/html. Note that attributes like "Type" are represented as subcategories of MorphosyntacticCategory, cf. remarks there.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#MultipleNumeral
Numeral/Type="multiple"
Romanian: Traditional Romanian grammars usually distinguish seven numeral types, where five of them have specific forms and the other two are obtained by composition. The first group is made up by the following numeral types: cardinal (trei-three), ordinal (al treilea-the third), fractional (treime-one third), multiple (întreit-trine), collective (amândoi-both). The second group contains the numeral types which are composed by means of other parts of speech: distributive (câte trei-...each three...), adverbial (de trei ori-thrice) and again the collective numeral which also has compound forms (toţi trei-all three). Nonetheless, as the numerals of the second group have a weak syntactic cohesion, namely each composition element may be regarded as an element of the sentence, with its own grammatical function, these last numeral types are irrelevant for the morphosyntactic annotation.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#MultiplicativeCase
Case="multiplicative" (Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Negated
Negation="yes"
Resian: Negative is always marked as 'n' except for two verbs: 'nïman' / not to have, 'nïsi' / not to be.
Slovak: Negative verbs are marked as Negative=y, whereas non-negative verbs are marked as Negative=n. Verbs form negative by prefix 'ne-', with the exception of the verb "byť" (E. "to be") which forms the negative in indicative by using separate particle "nie", e.g. "nie je" (is not). Here, "je" would be marked as negative, despite having positive form.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Negation
Negative: the value 'yes' encodes negative verbal word-forms in Slavic languages and Estonian.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#NegativeAdverb
Type="negative"
Romanian: The distinction proposed here considers the principal syntactic properties of the adverbs. For Romanian, the general type includes most of the pronominal adverbs (demonstrative: aici (here), indefinite: oriunde (anywhere)). As argued before for pronouns and determiners, a distinct negative value is needed for adverbs as well (nicăieri - nowhere, niciodată - never). The particle type covers those adverbs which can dislocate verbal compound forms (ex. Ea a tot cântat -- She has ever sung) or mark degrees (ex. circa (about), foarte (very), prea (too)). Such adverbs are cam, mai, prea, şi, tot, foarte etc. A useful distinction in Romanian considers the adverbs which can have predicative role, that is they can govern a subordinate sentence (ex. Fireşte că o ştiu -- Certainly I know it). Here (for uniformity within a multilingual environment), they are squeezed into the modifier class. No formal distinction is made between the interrogative adverbs and the relative ones. The "portmanteau" type of adverb was introduced to cover some few words which can be both adverbs and conjunctions (with adverbial reading more frequent). This was necessary for tagging purposes.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#NegativeDeterminer
Determiner/Type="negative" (Romanian)
Romanian: The need for a negative value of the determiners' Type attribute is argued on the same lines as in the section on the pronoun's Type. In Romanian the negative determiner is expressed by the unit nici + indefinite article (e.g. nici un, nici o).
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#NegativeParticle
e.g., n-/nu, nu (ro)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#NegativePronoun
Pronoun/Type="negative" (Romanian, Slavic)
Romanian: In Romanian it is worth differentiating the negative pronoun from other indefinite pronouns: a negative pronoun cannot be an argument for a verb unless the verb itself is negated too (e.g. Nu am văzut pe nimeni / *Am văzut pe nimeni).
Slovak: Negative and general pronouns ("general" Pronouns concern the Pronouns like "všetci" [E. "all"], "každý" [E. "every"] etc.) are important from the viewpoint of their syntactic distribution.
cf. Ukrainian EmphaticPronoun: The emp(h)atic Type of Pronoun is used for pronoun forms ні"кому, ні"чому, ні"чим, ні"кого, etc., with complex meanings like "there is nobody/nothing (to do sth/to use for doing sth, etc.)". Orthographically these are identical forms of negative nominal pronouns ніхто, ніщо "nobody, nothing" in oblique cases, however, with differing accent. They are referred to as either separate pronoun lexemes or predicatives in grammars. All Ukrainian emphatic pronoun forms include negation.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#NegativeSubordinatingConjunction
Sub_Type="negative" (Romanian, Serbian, Russian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#NeuterGender
Romanian: In Romanian the declension of a neuter noun always follows in singular a masculine paradigm and in plural a feminine one. Specific implementations could take advantage of this rule and by organizing the paradigmatic space in partial paradigms (masc-sing, masc-pl, fem-sing, fem-pl) to get rid of neuter value for the gender attribute.
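The partial-paradigm idea described above can be sketched as follows. This is a minimal illustration in Python, assuming the four partial paradigms named in the note; the function name and the full-word gender and number values are hypothetical and not part of the MTE specifications.

```python
# Illustrative sketch: resolve a Romanian noun's gender/number to one of the
# four partial paradigms mentioned above (masc-sing, masc-pl, fem-sing, fem-pl).
# Function name and argument values are assumptions made for this example.
def partial_paradigm(gender, number):
    if number not in ("singular", "plural"):
        raise ValueError(f"unknown number: {number}")
    if gender == "neuter":
        # Neuter nouns follow a masculine paradigm in the singular
        # and a feminine paradigm in the plural.
        gender = "masculine" if number == "singular" else "feminine"
    short_gender = {"masculine": "masc", "feminine": "fem"}[gender]
    short_number = {"singular": "sing", "plural": "pl"}[number]
    return f"{short_gender}-{short_number}"

print(partial_paradigm("neuter", "singular"))
print(partial_paradigm("neuter", "plural"))
```

With such a mapping, an implementation would not need a separate neuter value in the paradigm tables, as the note suggests.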
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#NoClitic
Clitic="no" (Noun/Adjective: Romanian; Verb: Romanian, Polish, Serbian, Persian)
Slovak Pronoun: The Clitic attribute distinguishes clitical vs. nonclitical pronominal forms, e.g. "ti" vs. "tebe".
Romanian Verb, Noun, Adjective: The cliticization phenomenon in Romanian is not restricted to verb-pronoun relationship, but may also be observed with the (main) verb and the auxiliary, the noun or adjective with pronoun, with noun or adjective with copula, pronoun with auxiliary, preposition with (indefinite) article, numeral or (indefinite) pronoun, negative adverb with verb, auxiliary or pronoun, and some others (mainly created through the contracted forms of the verb "a fi"-to be). We restrict ourselves to considering only the graphically marked cliticizations. In such cases, the two, three or (sometimes) four constituents of a cliticized word-form are always separated by a hyphen. Omitting the hyphen in such cases is an unacceptable error in written Romanian.
Romanian Article: Note that the definite article has only enclitic forms, except for one proclitical form (lui + proper noun: lui Ion). The inflected forms of the foreign-origin words (mainly nouns) not fully assimilated, are usually written with a hyphen between the base-form and the inflectional ending. In our encoding, we classified these endings (which are supposed to be split by the segmenter) as clitic articles (clitic attribute is always "y") which can be either definite (type=f, "-istul") or indefinite (type=i, "ist") and are characterised by gender (gender=m, "ist"; gender=f, "istă"), number (number=s, "ist"; number=p, "işti") and case (case=r, "istul"; case=o, "istului").
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#NoHuman
Human="no"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Nominal
Pronoun/Syntactic_Type="nominal" (Slavic), Abbreviation/Syntactic_Type="nominal"
Slovak Pronoun: Pronouns are distinguished between having a (syntactically) nominal and (syntactically) adjectival function. All pronominal types except the demonstrative and possessive one can be nominal, and all except for the personal one can be adjectival.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#NominalAdjective
Formation="nominal" (Czech)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#NominativeCase
e.g., eu, tu (ro)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#NonInitialCoordinatingConjunction
Coord_Type="non-initial" (English)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#NonNegated
Negation="no"
Resian: Negative is always marked as 'n' except for two verbs: 'nïman' / not to have, 'nïsi' / not to be.
Slovak: Negative verbs are marked as Negative=y, whereas non-negative verbs are marked as Negative=n. Verbs form negative by prefix 'ne-', with the exception of the verb "byť" (E. "to be") which forms the negative in indicative by using separate particle "nie", e.g. "nie je" (is not). Here, "je" would be marked as negative, despite having positive form.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#NoncliticElement
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#NonspecificPronoun
Pronoun/Type="nonspecific" (Russian)
Russian: Type=nonspecific marks the following Russian words: весь 'all', всякий 'any, every', сам 'oneself', самый 'the very', каждый 'every, each', иной 'other', любой 'any', другой 'other'. The name "nonspecific" follows Halliday's Introduction to Functional Grammar, Section 6.2.1.1.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Noun
http://nl.ijs.si/ME/V4/msd/html/msd.N.html
Gender and Number are the only attributes specified for all languages.
Slovak: Adjectival nouns (gazdiná, hostinský) are classified as nouns. Sometimes the distinction between noun and adjective is not as clear as we want (obchodný cestujúci).
Ukrainian: Gerunds are not differentiated, but could be treated as a special class of nouns, nota bene: they possess aspect.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Number
Hungarian has three types of number in the nominal inflection: 1. The number of the noun. 2. The number of owners that own the noun. 3. The number of the context given referent, which is some possession of the noun, i.e. belongs to the noun (anaphoric possessive).
Macedonian: The attributes Owner_Number and Owner_Gender are used for the pronoun negov. For other pronouns, ? is used.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Numeral
English: Numerals have not been subsumed under adjectives, pronouns, determiners, etc. because the internal structure of complex numerals is idiosyncratic. (MTE v.4)
Czech: Numerals have been specified as a separate category because of their specific syntactic distribution. We have specified two syntactic classifications by means of the attributes Type and Class; they concern different syntactic distributions. For instance "několik" (E. "several") will be characterized as: Type: cardinal Class: indefinite (MTE v.4)
Slovak: Numerals have been specified as a separate category because of their specific syntactic distribution. We have specified two syntactic classifications by means of the attributes Type and Class; they concern different syntactic distributions. For instance "niekoľko" (E. "several") will be characterized as: Type: cardinal, Class: indefinite. Note that difference between pronouns and these classes of numerals is fuzzy and many are indeed classified as pronouns. (MTE v.4)
Hungarian: Types like ordinal and cardinal are grouped as numerals. (MTE v.4)
Resian: A division of numerals according to their nominal, adjectival or adverbial function is not usual in Resian grammar. (MTE v.4)
Romanian: In Romanian (as in many other languages) several numerals have noun behaviour (some grammarians classify such numerals as nouns) with gender and declension of their own, which they preserve even in the composition of the superior order numerals; these are, for instance, sută (hundred), mie (thousand), milion (million) and miliard (billion). In a sentence most numerals may fulfill the function of other parts of speech like noun, determiner or adverb. (MTE v.4)
Within the part of speech Numeral the type multipl[icativ]e is defined, but to the Czech tagset a multiple numeral is an adverbial one (dvakrát ‘twice’), whereas to the Slovene tagset it is adjectival (dvojen ‘double’). (Derzhanski and Kotsyba 2009)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#NumeralAgreementClass
In most Slavic languages, Numerals and Quantifiers involve specific agreement patterns, e.g., in Russian:
(a) SingularQuantifier (MTE v4: Numeral/Class="definite1"): requires noun in nominative singular, e.g., один год "one year"
(b) PaucalQuantifier (MTE v4: Numeral/Class="definite234"): requires noun in genitive singular, e.g., два/три/четыре года "two/three/four years"
(c) PluralQuantifier (MTE v4: Numeral/Class="definite"): requires noun in genitive plural, e.g., пять/много/сколько/столько лет "five/many/how many/that many years"
Bulgarian has done away with the distinction between 4 and 5, and generalised the 2-4 form to all numerals (and some other quantifiers), but the others generally keep it. Also Slovene has a living dual (both Sorbians likewise, but they haven't been MTEd).
Some Czech feminine and neuter body parts have preserved dual forms, and if the noun is dual, so are its attributes (adjectives, pronouns). So 2 differs formally from 3-4. The corresponding agreement pattern is a DualQuantifier (MTE v4: Numeral/Class="definite2"). (Ivan A. Derzhanski & Christian Chiarcos)
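The Russian patterns (a) to (c) above can be read as a simple lookup from agreement class to the form required of the noun. The following Python sketch assumes the MTE v4 Class values quoted above as keys; the table name, the function name and the (case, number) tuples are illustrative and not part of the ontology.

```python
# Illustrative only: Russian agreement patterns (a)-(c) as listed above.
# Keys are the MTE v4 Numeral/Class values; names are hypothetical.
RUSSIAN_AGREEMENT = {
    "definite1": ("nominative", "singular"),  # один год
    "definite234": ("genitive", "singular"),  # два/три/четыре года
    "definite": ("genitive", "plural"),       # пять/много лет
}

def required_noun_form(numeral_class):
    """Return the (case, number) required of the noun, or None if not listed."""
    return RUSSIAN_AGREEMENT.get(numeral_class)

print(required_noun_form("definite234"))
```

Classes not covered by the Russian example, such as the Czech "definite2", return None here rather than a guessed value.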
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#NumeralForm
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#NumeralThreeOrFour
Numeral/Class="definite34" (Polish, Czech)
cf. Slovak: Among the definite numbers there are four subclasses (definite1, definite2, definite34, definite) which differ in their syntactic distribution and contain the following numerals: {1}, {2,3,4}, {5,6,...}
Polish: The IPIC accommodability feature for numerals with its two values "agreeing" (congr) and "governing" (rec) is presented here as the combination of the Number and the Class attribute used for Czech: definite = rec (IPIC) = governing (g) pięć, pięciu, dwóch; definite34 = congr (IPIC) = agreeing (a), Number=p dwa, dwaj, trzy, trzej, cztery. Cf. for Czech: definite2 = congr (IPIC) = agreeing (a), Number=d; definite1 = congr (IPIC) = agreeing (a), Number=a.
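The correspondence between the Class attribute and the IPIC accommodability feature stated above can be written down as a small table. This is a sketch in Python under the assumption that only the four Class values mentioned in the note are relevant; the dictionary and function names are hypothetical.

```python
# Hypothetical helper: maps an MTE v4 Numeral/Class value to the IPIC
# accommodability value, following the correspondences quoted above.
CLASS_TO_IPIC = {
    "definite": "rec",      # governing, e.g. pięć, pięciu, dwóch
    "definite34": "congr",  # agreeing, Number=p, e.g. dwa, trzy, cztery
    "definite2": "congr",   # agreeing (given above for Czech)
    "definite1": "congr",   # agreeing (given above for Czech)
}

def ipic_accommodability(numeral_class):
    """Return 'rec' (governing) or 'congr' (agreeing) for a Class value."""
    if numeral_class not in CLASS_TO_IPIC:
        raise ValueError(f"no IPIC correspondence listed for {numeral_class!r}")
    return CLASS_TO_IPIC[numeral_class]

print(ipic_accommodability("definite34"))
```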
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#NumeralTwoToFour
Numeral/Class="definite234" (Slovak)
Slovak: Among the definite numbers there are four subclasses (definite1, definite2, definite34, definite) which differ in their syntactic distribution and contain the following numerals: {1}, {2,3,4}, {5,6,...}
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ObliqueCase
Case="oblique" (Romanian, Macedonian)
In the Romanian case system the value 'oblique' conflates 'genitive' and 'dative'. In the Macedonian case system the value 'oblique' conflates archaic forms of 'genitive', 'dative' and 'accusative'.
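As an illustration of the conflation described above, the cases covered by 'oblique' can be listed per language. The Python sketch below uses the language codes "ro" and "mk" and a hypothetical function name; it is not an interface of the ontology.

```python
# Illustrative sketch: which traditional cases Case="oblique" conflates,
# per language, as stated above. Names and language codes are assumptions.
OBLIQUE_CONFLATES = {
    "ro": {"genitive", "dative"},
    "mk": {"genitive", "dative", "accusative"},  # archaic forms
}

def oblique_covers(language, case):
    """True if Case="oblique" subsumes the given case in that language."""
    return case in OBLIQUE_CONFLATES.get(language, set())

print(oblique_covers("ro", "accusative"))
print(oblique_covers("mk", "accusative"))
```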
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#OrdinalAdjective
Type="ordinal" (Resian, Serbian, Ukrainian, Macedonian)
Macedonian: Words like prv, vtor (eng. first, second) are considered ordinal numerals. (Note in Adjective)
Ukrainian: Relative adjectives (Ukr. відносні прикметники) are here labelled "o(rdinal)" for the sake of continuity with the Slovene tagset, where this term translates Slovene vrstni (pridevniki).
(MTE v4)
about "ordinal": "actually relative, a mistranslation of the Slovenian term vrstni" (Derzhanski and Kotsyba 2009)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#OrdinalNumeral
Macedonian: Words like prv, vtor (eng. first, second) are considered ordinal numerals. The ordinal numerals have the same inflectional characteristics as adjectives.
(MTE v4)
Romanian: Traditional Romanian grammars usually distinguish seven numeral types, where five of them have specific forms and the other two are obtained by composition. The first group is made up by the following numeral types: cardinal (trei-three), ordinal (al treilea-the third), fractional (treime-one third), multiple (întreit-trine), collective (amândoi-both). The second group contains the numeral types which are composed by means of other parts of speech: distributive (câte trei-...each three...), adverbial (de trei ori-thrice) and again the collective numeral which also has compound forms (toţi trei-all three). Nonetheless, as the numerals of the second group have a weak syntactic cohesion, namely each composition element may be regarded as an element of the sentence, with its own grammatical function, these last numeral types are irrelevant for the morphosyntactic annotation.
(MTE v4)
Slovak/Bulgarian: Ordinal (qualitative) numerals have an enumerating property, through which one can determine the consecutive position of an object in an ensemble of homogenous objects: prvý deň, druhý mesiac, tretia sekunda; първи ден, втори месец, трета секунда /first day, second month, third second/. (Dimitrova et al. 2009)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#OrthographicalRepresentationOfNumeral
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#OtherInterjection
Interjection/Type="other" (Hungarian, as compared to Type="mood")
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PartOfFixedExpression
Some forms can only be used in a fixed context, e.g., polsku in po polsku. They are classified as special kinds of adjectives in the IPIC. In the MTE version this information is preserved in the status of a "burkinostka". This term is devised by Magdalena Derwojedowa and refers to dependent words like Burkina which only make sense and can be morphosyntactically identified in a fixed combination (Burkina Faso).
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Participle
Verb/VForm="participle" or Adjective/Type="participle"
Czech: Adjectival active and passive participles, e.g. "stojící" (E. "standing") or "udělaný" (E. "performed" or "done", cf. Note 4 above) are classified as adjectives.
(MTE v4)
Slovak: The 'past participle' in Slovak is used for expressing compound active past Tense and is encoded as: Type=p(articiple), Tense=pa(s)t.
(MTE v4)
Slovak/Bulgarian: Vform=participle(p) corresponds to Slovak L-participle, in Bulgarian called just the participle and is used to form the past tense or the conditional. In Bulgarian, it also includes past participle (говорено /spoken/).
(Dimitrova et al. 2009)
Macedonian: The passive participle is used in verbal forms with the auxiliary ima / nema (eng. to have, to have not). The verbal adjective, in case it is used out of this construction, is considered as separate lemma.
(MTE v4)
Romanian: The following features are pertinent to those moods which permit an adjectival use, i.e. participle and gerund. However, the adjectival use of gerund is extremely rare (o mână tremurândă - a shaking hand) and therefore gender and number apply mainly for the participle. (MTE v4)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ParticipleAdverb
Type="participle" (Slovene)
? Is this an "AdverbialParticiple" as in Russian ?
Macedonian: Verb forms gerund and adverbial participle are taken to be separate (noun and adverb) lemmas.
Resian: Adverbial participles like standard Slovene 'leže' / lying down are not attested for Resian.
Romanian: The distinction proposed here considers the principal syntactic properties of the adverbs. For Romanian, the general type includes most of the pronominal adverbs (demonstrative: aici (here), indefinite: oriunde (anywhere)). As argued before for pronouns and determiners, a distinct negative value is needed for adverbs as well (nicăieri - nowhere, niciodată - never). The particle type covers those adverbs which can dislocate verbal compound forms (ex. Ea a tot cântat -- She has ever sung) or mark degrees (ex. circa (about), foarte (very), prea (too)). Such adverbs are cam, mai, prea, şi, tot, foarte etc. A useful distinction in Romanian considers the adverbs which can have predicative role, that is they can govern a subordinate sentence (ex. Fireşte că o ştiu -- Certainly I know it). Here (for uniformity within a multilingual environment), they are squeezed into the modifier class. No formal distinction is made between the interrogative adverbs and the relative ones. The "portmanteau" type of adverb was introduced to cover some few words which can be both adverbs and conjunctions (with adverbial reading more frequent). This was necessary for tagging purposes.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Particle
Slovak: Particles form a separate part of speech category (see below) as is customary in Slovak grammars. (MTE v4)
In the Slovak MTE tagset, we simplified our task enormously by resigning the classification attempts (which can be analysed ad nauseam to an arbitrary precision (Šimková, 2004)), and all the articles have the same simple tag P. (Dimitrova et al. 2009)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ParticleAdverb
Type="particle" (Romanian, Hungarian)
Slovak: Particles form a separate part of speech category (see below) as is customary in Slovak grammars.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ParticleFormation
Particle/Formation Formation: refers to the graphical components: simple, i.e. consisting of one word; compound, i.e. consisting of more than one word.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PartitiveCase
Case="partitive" (Estonian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PassiveVoice
Voice="passive"
Macedonian: Two types of (adjectival) participles exist: active and passive. Active corresponds to Macedonian L-form and passive to verbal adjective, neuter gender, singular. For example, nosel is encoded as VForm=Participle, Voice=Active, nosen as VForm=participle, Voice=Passive.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PastTense
Tense="past" (Ukrainian also for adjectives)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PaucalNumber
Number="paucal" (Serbian Verb)
PaucalNumber is a form used with numerals from 2 to 4 (cf. PaucalQuantifier). (Ivan A. Derzhanski, email 2010/06/16)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PaucalQuantifier
In many Slavic languages, numerals between 2 and 4 (and some quantifiers) involve a specific agreement pattern that is different from that of smaller and greater numbers. In Russian, for example, genitive singular is required. These numerals, and quantifiers with the same characteristics, are referred to here as "paucal quantifiers". (cf. David Pesetsky, http://www.uni-leipzig.de/~jtrommer/Harvard/pesetsky.pdf)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PerfectiveAspect
Aspect="perfective" (Noun: Polish; Verb: Slavic; Adjective: Polish, Ukrainian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Person
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PersonOfObject
Hungarian verbs ... [have] two conjugations: definite and indefinite.
The indefinite conjugation is used:
The definite conjugation is used:
The term `conjugation', while traditional, is confusing here: it normally refers to a paradigmatic class, not to part of a lexeme's paradigm. What Hungarian has in fact is limited marking of the person of the direct object (object agreement) in the verb, with the caveat that a 3rd person object is only marked if it is definite, a 2nd person object is only marked if the subject is 1st person singular, and a 1st person object is never marked. (Ivan A. Derzhanski, email 2010/06/18)
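The three conditions in the preceding note can be read as a decision rule for when the person of the direct object is marked on the verb. A minimal Python sketch, in which the function name, the argument names and the value encodings are all hypothetical:

```python
# Illustrative sketch of the object-agreement conditions described above.
# Persons are 1, 2, 3; subject_number is "singular" or "plural".
def object_person_marked(subject_person, subject_number,
                         object_person, object_definite):
    """Return True if the verb marks the person of the direct object."""
    if object_person not in (1, 2, 3):
        raise ValueError(f"unknown person: {object_person}")
    if object_person == 3:
        # A 3rd person object is only marked if it is definite.
        return bool(object_definite)
    if object_person == 2:
        # A 2nd person object is only marked if the subject is 1st person singular.
        return subject_person == 1 and subject_number == "singular"
    # A 1st person object is never marked.
    return False

print(object_person_marked(1, "singular", 2, True))
```

The sketch covers only the marking conditions stated in the note; it says nothing about the actual verb endings.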
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PersonalPronoun
Pronoun/Type="personal" and Pronoun/Referent_Type="personal"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PluperfectTense
Tense="pluperfect"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PluralNumber
e.g., aktuaalseimatesse/aktuaalseim, Baltimaadel/Baltimaad, Baltimaades/Baltimaad, endilt/ise, endist/ise, esimesteks/esimene, esimestele/esimene, esimestena/esimene, esimestes/esimene (et)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PluralQuantifier
Numeral/Class="definite" (Czech, Polish, Slovak)
Slovak: Among the definite numbers there are four subclasses (definite1, definite2, definite34, definite) which differ in their syntactic distribution and contain the following numerals: {1}, {2,3,4}, {5,6,...}
Polish: The IPIC accommodability feature for numerals with its two values "agreeing" (congr) and "governing" (rec) is presented here as the combination of the Number and the Class attribute used for Czech: definite = rec (IPIC) = governing (g) pięć, pięciu, dwóch; definite34 = congr (IPIC) = agreeing (a), Number=p dwa, dwaj, trzy, trzej, cztery. Cf. for Czech: definite2 = congr (IPIC) = agreeing (a), Number=d; definite1 = congr (IPIC) = agreeing (a), Number=a.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PortmanteauAdverb
Type="portmanteau" (Romanian)
Romanian: The distinction proposed here considers the principal syntactic properties of the adverbs. For Romanian, the general type includes most of the pronominal adverbs (demonstrative: aici (here), indefinite: oriunde (anywhere)). As argued before for pronouns and determiners, a distinct negative value is needed for adverbs as well (nicăieri - nowhere, niciodată - never). The particle type covers those adverbs which can dislocate verbal compound forms (ex. Ea a tot cântat -- She has ever sung) or mark degrees (ex. circa (about), foarte (very), prea (too)). Such adverbs are cam, mai, prea, şi, tot, foarte etc. A useful distinction in Romanian considers the adverbs which can have predicative role, that is they can govern a subordinate sentence (ex. Fireşte că o ştiu -- Certainly I know it). Here (for uniformity within a multilingual environment), they are squeezed into the modifier class. No formal distinction is made between the interrogative adverbs and the relative ones. The "portmanteau" type of adverb was introduced to cover some few words which can be both adverbs and conjunctions (with adverbial reading more frequent). This was necessary for tagging purposes.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PortmanteauConjunction
Conjunction/Type="portmanteau" (Romanian)
Romanian: The "portmanteau" type of conjunction applies only to the word "şi" which can be both a coordinating conjunction and an adverb. The distinction among these interpretations is rather tricky for the average native speaker and was a constant source of noise in automatic tagging. Therefore, for the sake of automatic processing we defined this "portmanteau" type value.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PositiveDegree
English: Since many English comparatives and superlatives are formed with more/most, "positive" cannot be interpreted as "neither comparative nor superlative".
Slovak: The qualificative adjectives which have no degrees of comparison have the Degree value equal to p(ositive). The adverbs which have no degrees of comparison have the Degree value equal to p(ositive) – similarly as adjectives.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PositiveSubordinatingConjunction
Sub_Type="positive" (Romanian, Serbian, Russian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PossessiveAdjective
Type="possessive"
Adjective/Type="possessive" are denominal, not pronominal
(Ivan A Derzhanski, email 2010/06/09)
cf. "adjectival" for Bulgarian pronouns: Bulgarian has language specific Type=adjectival(a), for words like умно /cleverly, wisely, sensibly/, which are derived from adjectives. (Dimitrova et al. 2009, maybe referring to MTE 3)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PossessiveArticle
Type="possessive"
Romanian: Although it presents only a few items, the article in Romanian has four types, unlike in most of the European languages. Beside the two recommended types: definite and indefinite which have the generally known semantic value, Romanian uses two additional types of articles, which are semantically subordinated to the definite article but which have special forms and meanings: (1) the possessive article (also called genitival article) is an element in the structure of the possessive pronoun, of the ordinal numeral (e.g. al meu (mine) and al treilea (the third)), and of the indefinite genitive forms of the nouns (e.g. capitol al cărţii (chapter of the book)). (2) the demonstrative article links a definite noun to its determinants, links a numeral or an adjective to a noun, and it is a constituent part of the relative superlative (e.g. fata cea mare (the elder girl), cel leneş (the lazy), respectively prietenul cel mai bun (the best friend)).
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PossessiveDeterminer
Determiner/Type="possessive" (English, Romanian, Persian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PossessivePronoun
Pronoun/Type="possessive" and Pronoun/Referent_Type="possessive"
Macedonian: Words like moj, tvoj (eng. my, your) are considered possessive pronouns.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PostnominalModification
Determiner/Modific_Type="postnomin" (Romanian)
Romanian: As mentioned in the corresponding section on Pronoun, the Modific_Type attribute is relevant for some determiners too. The prenominal determiner always precedes the noun (e.g. acest băiat - this boy), whereas the postnominal determiner appears only after the noun (e.g. băiatul acesta - this boy).
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Postposition
Type="postposition"
English: Postpositions are rare in English. "possessive" 's and ' might be considered postpositions, especially if the alternative is to assign them to the unique membership class (where by definition they would be unrelated).
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PremodifyingOrdinalNumeral
Numeral/Type="ordinal2" (Persian)
e.g., هجدهم/هجده نهم/نه دهم/ده (MTE v4)
when an ordinal numeral acts as an adjective (Hamidreza Kobdani, email 2010/06/15)
Behrang Qasemizadeh (email 2010/06/26)
In Persian a number can be inflected by two different suffixes to express ordinal meaning. These suffixes are "om" and "omin". They both, more or less, have the same meaning; however, they differ morphosyntactically. I discussed this with a number of colleagues when I was back in Iran, and most of them agreed with the proposed classification into ordinal and ordinal2. Let me give you an example:
"nafar" in Persian means person
"yek" in Persian means one
yek + om = yekom (i.e. First)
yek + omin = yekomin (i.e. also First)
the English phrase "first person" can be translated to Persian as follows:
nafar yekom
yekomin nafar
As you see, ordinal1 and ordinal2 appear in different positions in a phrase.
=> ordinal2 premodifying, ordinal(1) postmodifying
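The word-order contrast in the example above can be sketched in Python. This is only an illustration: the function name and the dictionary are hypothetical, and plain string concatenation of the suffix is a simplifying assumption that is only checked against the "yek" example from the email.

```python
# Illustrative sketch; names are hypothetical and not part of MULTEXT-East.
# Suffixes as given in the email: "om" for Type="ordinal", "omin" for Type="ordinal2".
ORDINAL_SUFFIX = {"ordinal": "om", "ordinal2": "omin"}


def ordinal_phrase(cardinal: str, noun: str, numeral_type: str) -> str:
    """Return a transliterated phrase with the ordinal in its expected position."""
    # Assumption: the suffix is simply appended to the cardinal.
    ordinal = cardinal + ORDINAL_SUFFIX[numeral_type]
    if numeral_type == "ordinal2":
        # ordinal2 premodifies: the numeral precedes the noun.
        return f"{ordinal} {noun}"
    # ordinal postmodifies: the numeral follows the noun.
    return f"{noun} {ordinal}"


print(ordinal_phrase("yek", "nafar", "ordinal"))   # nafar yekom
print(ordinal_phrase("yek", "nafar", "ordinal2"))  # yekomin nafar
```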
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PrenominalModification
Determiner/Modific_Type="prenomin" (Romanian)
Romanian: As mentioned in the corresponding section on Pronoun, the Modific_Type attribute is relevant for some determiners too. The prenominal determiner always precedes the noun (e.g.acest băiat - this boy), whereas the postnominal determiner appears only after the noun (e.g. băiatul acesta - this boy).
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Preposition
Type="preposition"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PrepositionalCase
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PresentTense
Tense="present" (Ukrainian also for adjectives)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ProQuantifier
Numeral/Type="pronominal" (Slovene)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Program
Residual/Type="program" For Slovene the Type attribute has been introduced on Residual, which distinguishes the values of "foreign", to mark words in a stretch of foreign language text, "typo", a mis-typed word, and "program", where the tokenisation program made a mistake. The second, and esp. the third value are useful for hand-annotation of corpora.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ProgressiveAspect
Aspect="progressive" (Noun: Polish; Verb: Slavic and Persian; Adjective: Polish, Ukrainian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Pronominal
Abbreviation/Syntactic_Type="pronominal" (Romanian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Pronoun
English: "General" pronouns are those which are not personal, possessive, demonstrative or reflexive. The choice of these four categories is based on distributional facts, though at a rather high level of abstraction. They enter into anaphoric dependencies which are signalled morphosyntactically and are therefore (in principle) more amenable to automatic detection. Most general pronouns do not, although they too sometimes encode number information.
(MTE v4)
Definiteness on pronouns: Bulgarian has definiteness, but it is present only for the possessive and reflexive types of pronouns, and for some general pronouns. Examples include: Possessive: Мой – моя - моят /my/ Твой - твоя – твоят /your, 2 p. sing/ Негов – неговия – неговият /his/ Reflexive: Свой – своя – своят, своя – своята, свое-своето, свои - своите /his, her, its, their own/ Adverb (Dimitrova et al. 2009)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#PronounForm
feature "Pronoun_Form"
Pronoun_Form:
used to encode weak and strong pronouns in Romanian.
For Romanian we need an attribute (called Pronoun_Form) to make the distinction between strong and weak forms of the same pronoun. All the weak forms can be adjoined to the adjacent words both proclitically or enclitically. In such cases the junction is always graphically marked by a hyphen between the pronoun and the neighboring word. The hyphen also marks possible elisions from either pronoun or the adjacent word. Although in traditional grammar books the demonstrative, int_rel and indefinite pronouns are not characterised by person, in our dictionaries they are recorded (for reasons beyond morpho-lexical encoding) as 3rd person (the same as nouns). However, for the automatic tagging this value has been marked as irrelevant.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ProperNoun
Type=Proper
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#QualificativeAdjective
Type="qualificative"
Slovak: Only qualificative (and passive participle) Adjectives can be specified for Degree.
Czech: Three deverbative adjectival participles, i.e. past active participle, passive participle and present active participle are not distinguished. They are conflated in the 'qualificative' value of the Type attribute (Type=f).
Slovak: Adjectival active and passive participles, e.g. "stojaci" (E. "standing") or "urobený" (E. "made" or "done", cf. Note 4 above) are classified as (qualificative) adjectives.
Two deverbative adjectival participles, i.e. past passive participle and present active participle are not distinguished. They are conflated in the 'qualificative' value of the Type attribute (Type=f). Past active participle is for all purposes dead in Slovak, although the form sometimes appears in metalanguage usage.
Romanian: Although it is not common practice in Romanian linguistics, one could make the distinction between qualificative and determinative adjectives.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Quantifier
distinguishes subtypes of Numerals in Czech which have distinct syntactic distributions: e.g. subclasses for 1, 2, 3&4, etc. are distinguished.
(MTE v4)
Czech: Numerals have been specified as a separate category because of their specific syntactic distribution. We have specified two syntactic classifications by means of the attributes Type and Class; they concern different syntactic distributions. For instance "několik" (E. "several") will be characterized as: Type: cardinal Class: indefinite
(MTE v4)
Bulgarian has no Class attribute. Slovak has possible values according to the cardinality of the number,
definite1(1) for “one”, definite2(2) for “two”, definite34(3) for “three” or “four”, definite(f) for “five or
more”, demonstrative(d) (toľko/that many/), indefinite(i) (niekoľko/several/), interrogative(q)(koľko/how
many/). Definite1, definite2, definite34 and definite are separated according to syntactical structures the
numerals impose on the governed nouns – definite1 requires the corresponding noun to be in nominative
singular, definite2 in nominative plural, definite34 nominative plural, definite genitive plural.
(Dimitrova et al. 2009)
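As a rough illustration of the Slovak Class values just listed, the following Python sketch records the codes and the noun form that each definite class requires. The dictionary and function names are hypothetical; the codes and the case/number requirements are taken from the passage above.

```python
# Slovak Numeral/Class codes as listed by Dimitrova et al. (2009).
SLOVAK_NUMERAL_CLASS = {
    "1": "definite1",
    "2": "definite2",
    "3": "definite34",
    "f": "definite",
    "d": "demonstrative",
    "i": "indefinite",
    "q": "interrogative",
}

# (case, number) that each definite class imposes on the governed noun.
GOVERNED_NOUN_FORM = {
    "definite1": ("nominative", "singular"),
    "definite2": ("nominative", "plural"),
    "definite34": ("nominative", "plural"),
    "definite": ("genitive", "plural"),
}


def definite_class(n: int) -> str:
    """Pick the definite class from the cardinality of a positive whole number."""
    if n < 1:
        raise ValueError("expected a positive whole number")
    if n == 1:
        return "definite1"
    if n == 2:
        return "definite2"
    if n in (3, 4):
        return "definite34"
    return "definite"  # five or more


print(definite_class(4), GOVERNED_NOUN_FORM[definite_class(4)])
print(definite_class(7), GOVERNED_NOUN_FORM[definite_class(7)])
```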
Bulgarian equivalents of [Slovak] demonstrative, indefinite, interrogative are classified as pronouns of a respective Type (including relative), e.g. няколко ученика /a few students/ – indefinite pronoun + noun. or sometimes as adverbs. (Dimitrova et al. 2009)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Question
Pronoun/Wh_Type="question" (English; cf. Pronoun/Type="interrogative")
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Quotative
VForm="quotative" (Estonian)
A quotative is a grammatical device to mark reported speech in some languages (http://en.wikipedia.org/wiki/Quotative), e.g., in Estonian.
‘Reportedly, while he was going (in his boat), he turned over.’
Ta olevat oma paadiga ümber läinud
He was_QUOTATIVE his_own boat_WITH over gone.
(Estonian translation of an example given under http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsAQuotativeEvidential.htm) (Heiki-Jaan Kaalep, email 2010/06/22)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ReciprocalPronoun
Pronoun/Type="reciprocal" (Persian, Estonian, Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ReductionFeature
Polish:
The IPIC flexeme winien and predicatives like rad are treated as short adjectives—Definiteness="short-art".
The Vocalicity of (a)gglutinated forms like -em vs -m is mapped on the Definiteness attribute with its values "(f)ull-art" and "(s)hort-art" respectively, meaning "full form" and "short form". The terms are very artificial, but this category is used due to the similarity of the phenomenon.
(MTE v4)
Etymologically speaking, CliticDeterminerType and ReductionFeature are the same: the ending of the full form was originally a cliticised demonstrative pronoun (just like the article in Bg or Ro), and the semantic distinction was [+/- definite], but it has shifted to [attributive:predicative] or some such on some occasions.
However, keeping them together wouldn't be correct: Bulgarian has preserved (to a limited extent) the old long form, and has a fourfold opposition of, say, nov : novi : novija : novijat, the first two members of which have counterparts in several Slavic languages (although the functions differ), while the second two are
restricted to the Balkan sprachbund. I'd call them [-article short], [-article full], [+article short] and [+article full] respectively.
(Ivan A. Derzhanski, emails 2010/06/18)
[T]he suffixation of an actual pronoun to the adjective ... for the Balto-Slavic definite adjective inflection ... was used with definite nouns ... the Balto-Slavic forms show clear evidence of a well-attested IE pronoun suffixed to the adjective
Thomas McFadden (2004), On the pronominal origins of the Germanic strong adjective inflection, http://ifla.uni-stuttgart.de/institut/mitarbeiter/tom/downloads/gmcadj.pdf (to appear in Münchner Studien zur Sprachwissenschaft)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ReflexivePronoun
Pronoun/Type="reflexive"
Slovak: Type=reflexive encompasses all reflexive pronouns (sa, sebe, si, svoj, seba) as well as "sa" in its role as the obligatory particle of reflexive verbs. Personal and possessive reflexives are further distinguished via the Referent_Type attribute. "sa" in all its roles will be marked as the reflexive personal clitic pronoun.
Czech: Type=reflexive encompasses all reflexive pronouns ("se", "sebe", "si", "svůj") as well as "se" in its role as the obligatory particle of reflexive verbs. Personal and possessive reflexives are further distinguished via the Referent_Type attribute. "se" in all its roles will be marked as the reflexive personal clitic pronoun.
Resian: In the modern language 'swöj' / own is an adjective.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Relative
Wh_Type="relative" (cf. Pronoun/Type="relative") (English)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#RelativeDeterminer
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#RelativePronoun
Pronoun/Type="relative"
Bulgarian has Type=relative(r) (e.g. който), which in Slovak would be formed by two consecutive pronouns (ten, ktorý). (Dimitrova et al. 2009)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#RelativeQuantifier
Numeral/Class="relative" (Czech)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#RepetitiveCoordinatingConjunction
Coord_Type="repetit" (Romanian)
Romanian: In Romanian, there are three kinds of conjunctions depending on their usage: as such or together with other conjunctions or adverbs: (1) simple, between conjuncts: Ion ori Maria (John or Mary); (2) repetitive, before each conjunct: fie Ion fie Maria fie... (either John or Mary or...) (3) correlative, before a conjoined phrase, it requires specific coordinators between conjuncts: atât mama cât şi tata (both mother and father).
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Residual
Slovak: Special 'adverb prepositions' 'po, na, do', encountered in expressions like 'po anglicky', 'na zeleno', 'do modra' are classified as residuals. Traditional Slovak grammars do not like to consider them separate words, but more like modifiers of a following adverb. (MTE v4)
In Slovak, special 'adverb prepositions' (po, na, do), encountered in expressions like po anglicky, na zeleno, do modra are classified as residuals. Traditional Slovak grammars do not like to consider them separate words, but rather see them to be different part-of-speech, mostly an adverb (see interjections above), with a space inside. In corresponding Bulgarian expressions (e.g. на български), the residual will be classified as Sp (preposition). This is however just a difference in grammar description, not an inherent difference in the languages. (Dimitrova et al. 2009)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#RomanNumeral
Form="roman"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#SecondPerson
e.g., thee/you, thou, weren't/be+not (en)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#SentenceCoordinatingConjunction
Coord_Type="sentence" (Serbian, Russian, Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ShortArticle
Definiteness="short-art" (Noun: Bulgarian; Verb: Polish, Russian, Bulgarian; Adjective: Polish, Russian, Ukrainian, Bulgarian; Pronoun: Polish, Bulgarian)
Bulgarian:
For singular masculine, there are two forms: a full article(f)[l.s.] and a short article(s)[l.s.]. The full
article is used when a singular masculine form is the syntactic subject of the clause, otherwise a short one is
used – a purely orthographic rule. The distinction of full vs. short is not made for feminine, neuter and plural
forms, and we use just the yes(y) or no(n) to mark definiteness or respectively lack thereof. Therefore, the
definiteness attribute can take overall 4 different values: indefinite(n), definitive(y), short article(s), full
article(f)
e.g., мъж, мъжа, мъжът /a man, the man[short], the man [full]/
(Dimitrova et al. 2009)
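The Bulgarian rule described above can be written as a small decision function. This is a sketch under the assumptions stated in the passage (the full/short split applies only to masculine singular forms and depends on subjecthood); the function name and its parameters are hypothetical.

```python
def bulgarian_definiteness(gender: str, number: str, definite: bool,
                           is_subject: bool = False) -> str:
    """Return the Definiteness code n, y, s or f for a Bulgarian form."""
    if not definite:
        return "n"  # indefinite
    if gender == "masculine" and number == "singular":
        # Full article for the syntactic subject, short article otherwise.
        return "f" if is_subject else "s"
    # Feminine, neuter and plural forms only mark plain definiteness.
    return "y"


# мъж / мъжа / мъжът
print(bulgarian_definiteness("masculine", "singular", definite=False))                  # n
print(bulgarian_definiteness("masculine", "singular", definite=True))                   # s
print(bulgarian_definiteness("masculine", "singular", definite=True, is_subject=True))  # f
```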
Polish:
The IPIC flexeme winien and predicatives like rad are treated as short adjectives—Definiteness="short-art".
The Vocalicity of (a)gglutinated forms like -em vs -m is mapped on the Definiteness attribute with its values "(f)ull-art" and "(s)hort-art" respectively, meaning "full form" and "short form". The terms are very artificial, but this category is used due to the similarity of the phenomenon.
MTE v.4
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#SimpleAdposition
Adposition/Formation="simple"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#SimpleConjunction
Formation="simple"
Romanian: As with prepositions, we can distinguish two kinds of conjunctions in Romanian: (1) simple conjunctions: e.g. şi, dar, deşi etc. (2) conjunctions formed periphrastically, with some word/phrase combined by a conjunction: din moment ce, fără să, faţă de cum etc.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#SimpleCoordinatingConjunction
Coord_Type="simple" (Romanian, apparently in contrast to repetitive and correlative)
Romanian: In Romanian, there are three kinds of conjunctions depending on their usage: as such or together with other conjunctions or adverbs: (1) simple, between conjuncts: Ion ori Maria (John or Mary); (2) repetitive, before each conjunct: fie Ion fie Maria fie... (either John or Mary or...) (3) correlative, before a conjoined phrase, it requires specific coordinators between conjuncts: atât mama cât şi tata (both mother and father).
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#SimpleInterjection
Interjection/Formation="simple"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#SimpleParticle
Particle/Formation="simple"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#SingularNumber
e.g., 1900-tól/1900, 2050-től/2050, akiét/aki, Aladárét/Aladár, amelytől/amely, amiként/ami, amitől/ami, attól/az, azét/az (hu)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#SingularQuantifier
Numeral/Class="definite1" (Czech, Slovak)
Slovak: Among the definite numbers there are four subclasses (definite1, definite2, definite34, definite) which differ in their syntactic distribution and contain the following numerals: {1}, {2,3,4}, {5,6,...}
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#SociativeCase
Case="sociative" (Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#SpecialNumeral
Numeral/Type="special"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#SpecifierAdverb
Type="specifier" (English)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#StrongPronoun
Pronoun_Form="strong" (Romanian)
For Romanian we need an attribute (called Pronoun_Form) to make the distinction between strong and weak forms of the same pronoun. All the weak forms can be adjoined to the adjacent words both proclitically or enclitically. In such cases the junction is always graphically marked by a hyphen between the pronoun and the neighboring word. The hyphen also marks possible elisions from either pronoun or the adjacent word. Although in traditional grammar books the demonstrative, int_rel and indefinite pronouns are not characterised by person, in our dictionaries they are recorded (for reasons beyond morpho-lexical encoding) as 3rd person (the same as nouns). However, for the automatic tagging this value has been marked as irrelevant.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Subjunctive
Resian: The subjunctive is formally identical to the imperative, with one form for the three persons in the singular and the forms for the 2nd and 3rd person plural being identical.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#SubjunctiveParticle
Particle/Type="subjunctive" (Romanian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#SublativeCase
Case="sublative" (Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#SubordinatingConjunction
Type="subordinating"
In Romanian, each conjunction requires another mood, so that the diversity may be controlled by subcategorisation rules. This attribute distinguishes among the positive and negative conjunctions, providing means to control verbal double negation, (as in case of the negative pronouns, determiners and adverbs): nici NU am venit, nimeni NU vorbeşte, nici_un tren N-a trecut, nicăieri N-am văzut
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#SuperessiveCase
Case="superessive" (Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#SuperlativeDegree
e.g., najžobravejšie/žobravo, najžoviálnejšej/žoviálny, najžoviálnejšia/žoviálny, najžoviálnejšie/žoviálne, najžoviálnejšie/žoviálny, najžoviálnejšieho/žoviálny, najžoviálnejšiemu/žoviálny, najžoviálnejšiu/žoviálny, najžoviálnejšom/žoviálny (sk)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Supine
VForm="supine" (Slovene, Estonian)
Romanian: Traditionally, Romanian linguistics distinguishes between predicative and non-predicative moods. This distinction may be easily mapped into finite/non-finite dichotomy: indicative, subjunctive and imperative are finite; infinitive, participle and gerund are non-finite (only synthetic (non-compound) moods were mentioned; we use the opposition synthetic-analytic to distinguish between concatenative (synthetic) and compound (analytic) morpho-lexical phenomena). As only synthetic forms were considered, the values conditional and presumptive for the VForm attribute were left out in Romanian. Another value for VForm which was left out is the Supine. It appears mostly with a preposition, except for a few intransitive verbs when they are subordinated to the impersonal verb a trebui (must). Only the preposition allows for differentiating a supine from a participle-masculine-singular.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#SyntacticType
Syntactic_Type:
used to distinguish the nominal and adjectival function of Pronouns in Croatian, Resian and Czech. Furthermore, in Slovene and Serbian, the adverbial function of certain Pronouns is distinguished. Also used in Abbreviations to signal the Part of Speech of the abbreviation; currently used only by Romanian and Estonian.
(MTE v4)
Czech: Pronouns are distinguished between having a (syntactically) nominal and (syntactically) adjectival function. All pronominal types except the demonstrative and possessive one can be nominal, and all except for the personal one can be adjectival.
(MTE v4)
Romanian: Syntactic_Type: useful for specifying the grammatical category of an abbreviation. Although the values for this attribute could range over the part of speech categories in the language, in Romanian most of the abbreviations fall into the noun class.
(MTE v4)
Syntactic_Type in Slovak can be nominal(n) or adjectival(a) (e.g. ktorý, môj), which is absent in the Bulgarian language (there are no adjectival pronouns of this type). Slovak also has several quasi-adjectival pronouns classified as Syntactic_Type=a (e.g. tvoj), equivalents of which do exist in Bulgarian as well, but due to the lack of a clear distinction of the adjectival paradigm it was not felt necessary to introduce this value in Bulgarian MTE (Dimitrova et al. 2009)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#TemporalisCase
Case="temporalis" (Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Tense
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#TerminativeCase
Case="terminative" (Estonian, Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#ThirdPerson
Romanian: Although in traditional grammar books the demonstrative, indefinite and int_rel determiners are not characterised by person, in our dictionaries they are recorded (for reasons beyond morpho-lexical encoding) as 3rd person (the same as nouns). However, for the automatic tagging this value has been marked as irrelevant.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Transgressive
VForm="gerund" is ambiguous: with respect
to Latin, in whose grammatical tradition it originates, it refers to a
deverbal noun, and is needed in this function for Polish as well; in
descriptions of some other languages, however, it has been used for an
adverbial participle. The two meanings have nothing in common, except
that the English ing-form can translate both.
(Ivan A Derzhanski, email 2010/06/09)
Identified with transgressive:
Vform=transgressive(t)[l.s]. in Slovak corresponds to VForm=gerund(g) in Bulgarian – this is just a
difference in description. (Dimitrova et al. 2009)
There are redundant values, such as ‘transgressive’ and ‘gerund’ (values of the feature VForm of the part of speech Verb), which refer to the same category, but the former is used in the tagsets for Czech and Slovak and the latter for Bulgarian and Serbian.
(Derzhanski and Kotsyba, 2009)
MTE 4:
VForm="transgressive" (Czech, Slovak)
Czech: The term transgressive roughly corresponds to the term 'verbal participle'.
Slovak: The term transgressive roughly corresponds to the term 'verbal participle'. The transgressives have present tense, and do not distinguish any other categories except for negativeness.
(Note: this entails that Transgressive is-a Participle)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Transitive
Transitive="yes" (Persian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Transitivity
feature "Transitive"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#TranslativeCase
Case="translative" (Estonian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Typo
Residual/Type="typo" For Slovene the Type attribute has been introduced on Residual, which distinguishes the values of "foreign", to mark words in a stretch of foreign language text, "typo", a mis-typed word, and "program", where the tokenisation program made a mistake. The second, and esp. the third value are useful for hand-annotation of corpora.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#UniquitiveDeterminer
Determiner/Type="exceptional" (Persian)
i.e. تنها
it is the uniquitive determiner: "the only" (Hamidreza Kobdani, email 2010/06/15)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Verb
e.g., budete/byť, buďte/byť, majte/mať, máte/mať, nepostačia/nestačiť, nepostačí/nestačiť, nepostačím/nestačiť, nepostačíme/nestačiť, nepostačíte/nestačiť (sk)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#VerbForm
Feature "VForm" of verbs
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Verbal
Abbreviation/Syntactic_Type="verbal"
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#VerbalAdverb
Type="verbal" (Serbian, Macedonian, Hungarian)
Macedonian: Verbal adverbs (gerunds) like odejkji are not considered as verbal forms, but as separate adverbial non inflective lemmas
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#VerbalParticle
Type="verbal" (Bulgarian)
Type=verbal(v) is used to form different type of verbal syntactical relationships, e.g. to create future tense (ще говориш), or particles like се, да – Slovak uses very different verbal syntactical structures (Dimitrova et al. 2009)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#VocativeCase
Macedonian: Two vocative forms exist with same MSD, e.g. narode / narodu (narod) are both Ncmsvn.
Slovak: Slovak distinguishes 7 cases, the locative case being obligatorily prepositional. Vocative is identical with nominative, with the exception of several nouns and (substandard usage of) some proper names. Here, vocative is marked according to its syntactic role. 'ty' (E. 'you') is usually vocative. Many other pronouns can be marked as vocative because of their syntactical position, e.g. in 'môj bože' (E. 'my god'), 'môj' is vocative.
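To make the Macedonian MSD Ncmsvn cited above easier to read, the following Python sketch unpacks it position by position. The attribute order (Type, Gender, Number, Case, Definiteness) and the value glosses are assumptions that are not stated in this document and should be checked against the MULTEXT-East specifications; the function and table names are hypothetical.

```python
# Assumed attribute order for Macedonian nouns (not stated in this document).
NOUN_ATTRIBUTES = ["Type", "Gender", "Number", "Case", "Definiteness"]

# Only the codes needed for the example; glosses are assumptions.
VALUES = {
    "Type": {"c": "common"},
    "Gender": {"m": "masculine"},
    "Number": {"s": "singular"},
    "Case": {"v": "vocative"},
    "Definiteness": {"n": "no"},
}


def decode_noun_msd(msd: str) -> dict:
    """Map each position of a Noun MSD to an attribute/value pair."""
    if not msd.startswith("N"):
        raise ValueError("not a Noun MSD")
    codes = msd[1:]
    # Unknown codes are passed through unchanged rather than guessed.
    return {attr: VALUES[attr].get(code, code)
            for attr, code in zip(NOUN_ATTRIBUTES, codes)}


print(decode_noun_msd("Ncmsvn"))
```

Because narode and narodu share this MSD, decoding alone cannot tell the two vocative forms apart.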
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#Voice
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#WHType
Pronoun/Wh_Type, Determiner/Wh_Type
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#WeakPronoun
Pronoun_Form="weak" (Romanian)
For Romanian we need an attribute (called Pronoun_Form) to make the distinction between strong and weak forms of the same pronoun. All the weak forms can be adjoined to the adjacent words both proclitically or enclitically. In such cases the junction is always graphically marked by a hyphen between the pronoun and the neighboring word. The hyphen also marks possible elisions from either pronoun or the adjacent word. Although in traditional grammar books the demonstrative, int_rel and indefinite pronouns are not characterised by person, in our dictionaries they are recorded (for reasons beyond morpho-lexical encoding) as 3rd person (the same as nouns). However, for the automatic tagging this value has been marked as irrelevant.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#WithCliticS
feature Clitic_s: the 'yes' value of the Clitic_s attribute denotes Czech pronouns having the clitic morpheme 's' appended as a suffix.
Czech: The 'yes' value of the Clitic_s attribute denotes a verbal form having the clitic morpheme 's' appended as a suffix. This 's' morpheme expresses 2nd Person singular present Tense of the auxiliary Verb "být" (i.e. the form "jsi"). There is no intermediate hyphen between the verbal form and the 's' morpheme.
The Clitic_s attribute is specified for VForm=infinitive (VForm=n) and Vform=p(articiple) only.
(MTE v.4)
In Czech the 2nd person singular present tense form of the copula jsi
can be cliticised as -s on certain non-finite verb forms and pronouns, and its presence is indicated by
the positive value of the binary feature Clitic_s of the parts of speech Verb and Pronoun. Essentially
the same phenomenon exists in Polish, but it involves four cliticised forms of the copula (1sg -m, 1pl -śmy,
2sg -ś, 2pl -ście), and they float more freely (the host can be any content word, e.g. świniaś ‘thou
art a pig’, dobryś ‘thou art good’)
(Derzhanski and Kotsyba 2009)
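The Polish clitic inventory quoted above can be illustrated with a naive segmentation sketch in Python. The names are hypothetical, and the function only segments a form that is already known to carry a copula clitic; it does not decide whether a clitic is present (many ordinary Polish words end in -m, for example).

```python
# Cliticised copula forms as listed by Derzhanski and Kotsyba (2009).
POLISH_COPULA_CLITICS = {"m": "1sg", "śmy": "1pl", "ś": "2sg", "ście": "2pl"}


def split_copula_clitic(form: str):
    """Split a form assumed to carry a copula clitic into (host, clitic, person/number)."""
    # The four clitics end in different letters, so at most one can match.
    for clitic, value in POLISH_COPULA_CLITICS.items():
        if form.endswith(clitic) and len(form) > len(clitic):
            return form[: -len(clitic)], clitic, value
    return form, None, None


print(split_copula_clitic("dobryś"))   # ('dobry', 'ś', '2sg')
print(split_copula_clitic("świniaś"))  # ('świnia', 'ś', '2sg')
```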
Therefore modeled here as subClass of Clitic.
Clitic_s="yes" (Czech)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#WithCourtesy
Courtesy="yes" (Slovene/Resian, Persian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#WithoutCliticS
feature Clitic_s: the 'yes' value of the Clitic_s attribute denotes Czech pronouns having the clitic morpheme 's' appended as a suffix.
Czech: The 'yes' value of the Clitic_s attribute denotes a verbal form having the clitic morpheme 's' appended as a suffix. This 's' morpheme expresses 2nd Person singular present Tense of the auxiliary Verb "být" (i.e. the form "jsi"). There is no intermediate hyphen between the verbal form and the 's' morpheme.
The Clitic_s attribute is specified for VForm=infinitive (VForm=n) and Vform=p(articiple) only.
(MTE v.4)
In Czech the 2nd person singular present tense form of the copula jsi
can be cliticised as -s on certain non-finite verb forms and pronouns, and its presence is indicated by
the positive value of the binary feature Clitic_s of the parts of speech Verb and Pronoun. Essentially
the same phenomenon exists in Polish, but it involves four cliticised forms of the copula (1sg -m, 1pl -śmy,
2sg -ś, 2pl -ście), and they float more freely (the host can be any content word, e.g. świniaś ‘thou
art a pig’, dobryś ‘thou art good’)
(Derzhanski and Kotsyba 2009)
Therefore modeled here as subClass of Clitic.
Clitic_s="no" (Czech)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#WithoutCourtesy
Courtesy="no" (Slovene/Resian, Persian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#WordsCoordinatingConjunction
Coord_Type="words" (Serbian, Russian, Hungarian)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasAdjectiveFormation
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasAdpositionFormation
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasAnimacy
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasAspect
also applicable to Polish nouns, and Polish and Ukrainian adjectives
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasCase
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasClitic
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasConjunctionFormation
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasCourtesy
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasDefiniteness
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasDegree
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasFeature
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasFormation
Formation: refers to the graphical components: simple, i.e. consisting of one word; compound, i.e. consisting of more than one word.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasGender
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasHumanness
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasInterjectionFormation
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasModificationType
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasNegation
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasNumber
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasNumeralForm
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasOwnedNumber
feature "Owned_Number" (Pronoun and PronominalAdjectives; Hungarian Noun, Numeral) Owned_Number: in the Hungarian system, different word-forms are distinguished for nominals on the basis of so called 'anaphoric possessive' number, i.e. the number of the thing(s) possessed by the nominal in question.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasOwnerGender
Pronoun/Owner_Gender Owner_Gender: used to encode the Gender of the possessor in Pronouns and (in Romanian) Determiners.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasOwnerNumber
feature "Owner_Number" (Pronoun, for Hungarian also Noun and Adjective) Owner_Number: used to specify the possessor number in Pronouns, as well as (in Romanian) in Determiners, and (in Hungarian) in Adjectives and Nouns.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasOwnerPerson
feature "Owner_Person" (Pronoun (and pronominal Adjectives), Noun: Hungarian) Owner_Person: used to specify the possessor person in Hungarian in Adjectives and Nouns.
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasParticleFormation
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasPerson
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasPronounForm
Pronoun/Pronoun_Form
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasQuantifier
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasSubCase
A SubCase refers to a non-standard case, i.e., a grammatical distinction that is made in only a few inflection paradigms and is otherwise expressed by a single regular case.
Some Russian nouns take the non-standard ending "-у"/"-ю" in the genitive to express partitive meaning ("чашка горячего чаю") or in the prepositional (locative) to express locative meaning ("на шкафу").
(MTE 4; Serge Sharoff)
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasSyntacticType
Pronoun/Syntactic_Type and Abbreviation/Syntactic_Type
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasTense
also applied to Ukrainian Adjective
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasTransitivity
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasVerbForm
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasVoice
applied to Polish and Hungarian adjective
IRI: http://nl.ijs.si/ME/owl/multext-east.owl#hasWHType
This HTML document was obtained by processing the OWL ontology source code through LODE, Live OWL Documentation Environment, developed by Silvio Peroni.
e.g., АЕС АЗС АНДР АПК АСЕАН АТ АТП АТР АТС БРСР ВДНГ ВІЛ ВЛКСМ ВМС ВМСУ ВПК ВПС ВР ВУАН ВУЦВК (uk)