Map of all the languages in the SCOPIC Project and language profiles

Language Profiles



Auslan (Australian Sign Language, ISO code: asf) is the most widely used deaf signed language in Australia. Auslan evolved from late eighteenth-century British Sign Language (BSL) via deaf immigrants and teachers of the deaf from Britain (Johnston & Schembri, 2007). Auslan has since developed within social networks of deaf signing families, residential schools for deaf children, and social groups such as religious organisations and state deaf societies (see Schembri et al., 2010; Carty, 2018). Conservative estimates put the number of profoundly deaf signers at 6,500 (Johnston, 2004). However, the number of people who use Auslan every day is assumed to be much higher. In addition to profoundly deaf children and their families, many severely or even moderately deaf people use Auslan. There are also hearing signers including family members, friends, colleagues, professionals working in the deafness sector, and Auslan/English interpreters working with deaf signers. During the 2016 Australian Census, 10,114 people reported Auslan as a language other than English used at home (Australian Bureau of Statistics, 2017).

Auslan is an endangered language, mainly due to interrupted generational transmission and a subsequent population bias towards non-native signers, especially adult late learners. Only a small proportion of deaf signers learn Auslan from deaf parents or at a young age from peers at school (Johnston, 2004). It is more common for deaf people to learn signed language from peers at school or later as adults in the deaf community, rather than from their primary caregivers. Many signers can therefore be described as “new signers” who have experienced a non-traditional pathway to learning a signed language, often much later than childhood (De Meulder, 2018). This diversity results in signing ecologies that are extremely heterogeneous. Most deaf signers tend to live in urban centres such as Sydney, Melbourne, Brisbane, Adelaide and Perth, but a significant number live in regional areas and are more isolated with respect to social networks with other signers and access to communication services such as Auslan/English interpreting.

The Auslan Family Problems data was collected as part of the Auslan and Australian English Corpus, which aims to facilitate direct comparison of face-to-face, multimodal communication of deaf signers and hearing non-signing speakers from the same city (Hodge, Sekine, Schembri & Johnston, 2019). It follows on from the earlier Auslan Corpus, which was created in 2008 and consists of 300 hours of digital video recordings from 255 deaf signers from five cities. The Auslan data described in the SCOPIC project is from five pairs of deaf native or near-native Auslan signers (i.e. people who learned Auslan before age 10), men and women between the ages of 30 and 65, who have been living in Melbourne for at least ten years. The data are therefore partly representative of the native signing minority from this region.

As Auslan is a quintessentially face-to-face language without a written form, Auslan signers habitually integrate strategies for describing, indicating and depicting during their everyday language use (see Johnston, 1996; Ferrara & Hodge, 2018). These include the conventionalised lexical Auslan signs documented in the lexical database Auslan Signbank and fingerspelling and/or mouthing of English words, as well as less conventionalised strategies such as manual depiction (i.e. use of one or both hands to depict the movement, location and/or shape and size of humans, animals and inanimate objects) and bodily enactment (i.e. use of the body to mimetically re-enact action, thought, and dialogue). Furthermore, manual signs are often (but not always) spatially motivated: many signs can be modified to index locations in the multidimensional space in front of the signer’s body (see Johnston & Schembri, 2007; 2010, for more detail on Auslan structure and the different types of signs used in deaf signed languages).


Australian Bureau of Statistics (2017).

Ferrara, L., & G. Hodge. (2018). ‘Language as description, indication, and depiction’. Frontiers in Psychology, 9:716. DOI:10.3389/fpsyg.2018.00716

Hodge, G., Sekine, K., Schembri, A. & T. Johnston. (2019). ‘Comparing signers and speakers: Building a directly comparable corpus of Auslan and Australian English’. Corpora, 14(1): 63-76. DOI:10.3366/cor.2019.0161

Johnston, T. (1996). “Function and medium in the forms of linguistic expression found in a sign language,” in International Review of Sign Linguistics, Vol. 1, eds W. H. Edmondson and R. B. Wilbur (Mahwah, NJ: Lawrence Erlbaum), 57–94.

Johnston, T. (2004). ‘W(h)ither the deaf community? Population, genetics and the future of Australian Sign Language’. Sign Language Studies, 6(2): 137-173.

Johnston, T., & Schembri, A. (2007). Australian Sign Language: An Introduction to Sign Language Linguistics. Cambridge: Cambridge University Press.

Johnston, T., & Schembri, A. (2010). ‘Variation, lexicalization and grammaticalization in signed languages’. Langage et Société, 131: 19-35.




Kogi (ISO code: kog) belongs to the Chibchan language family and, together with its neighboring languages Damaná and Ika (and the extinct Kankuamo), comprises the Arwako subgroup. It is spoken by some 10,000 people on the northern slopes of the Sierra Nevada de Santa Marta in the north of Colombia.

Multilingualism is common in the Kogi community: in addition to Spanish as a language of wider communication, many who live in areas bordering one of the neighboring indigenous groups also speak Ika or Damaná. While the use of Spanish is on the rise, there remain many monolingual speakers, and Kogi is still acquired by children and used by all generations of speakers.

The Kogi maintain strong religious and cultural traditions and are essentially self-sufficient, living from agriculture and animal husbandry. Although they live most of the time on small farms scattered across the Sierra, each family belongs to a larger village where families occasionally gather for ceremonies and communal work. In each village, the mama plays the most important societal role and acts as priest, judge, or medicine man.

Structural features

Kogi has the basic constituent order SOV and exhibits split-ergativity in noun phrase marking: First and second person pronouns follow the nominative-accusative pattern, whereas third person pronouns and common nouns take ergative-absolutive case marking. In addition, Kogi features split intransitivity with respect to its person marking system. While most intransitive verbs take subject person indexes (usually marking A), a number of intransitive verbs can be construed as non-volitional and take object person indexes (usually marking P). Noun phrase clitics mark a number of cases, namely ergative, genitive, dative and locative.

Kogi displays morpho-phonemically complex verb inflection that involves synthetic as well as periphrastic constructions. Inflectional categories marked on verbs include person, tense/aspect, modality/mood, and negation. Periphrastic constructions involve a number of auxiliary verbs that typically indicate tense and aspect.

Kogi also features a form of epistemic marking called ‘engagement’ (Evans et al. 2017a, b; cf. “complex epistemic perspective”, Bergqvist 2016). This grammatical expression is found with four prefixes that attach to the auxiliary verb in periphrastic verb phrases. These four forms may be divided into two sets that take the speaker and the addressee, respectively, as origo. A focus on the perspective of the speaker is found with na-/ni-, where na- means that “the speaker knows x and expects the addressee to be unaware of x”, and ni- means that “the speaker knows x and expects the addressee to know x too”. na-/ni- are in turn contrasted with sha-/shi-, which encode a corresponding distinction in terms of non-shared/shared knowledge from the addressee’s perspective. sha- means that the speaker expects the addressee to know x while the speaker is unaware of x, and shi- means that the speaker expects the addressee to know x, and the speaker knows x too. shi-/sha- are used to signal the speaker’s acknowledgement of the addressee as primary knower, but at the same time encode the speaker’s assertion (without reduced certainty) of a talked-about event.
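The four-way contrast described above can be laid out as a small lookup table. The following Python sketch is purely illustrative: the names `Engagement`, `ENGAGEMENT_PREFIXES` and `prefix_for` are hypothetical, and the boolean fields simply restate the glosses given in the text rather than any published formalisation.

```python
from typing import NamedTuple


class Engagement(NamedTuple):
    """Gloss of one engagement prefix, restating the description in the text."""
    origo: str             # perspective the form is anchored to: "speaker" or "addressee"
    speaker_knows: bool    # does the speaker know x?
    addressee_knows: bool  # does the speaker expect the addressee to know x?


# Hypothetical table; the values follow the prose glosses only.
ENGAGEMENT_PREFIXES = {
    "na-": Engagement("speaker", True, False),
    "ni-": Engagement("speaker", True, True),
    "sha-": Engagement("addressee", False, True),
    "shi-": Engagement("addressee", True, True),
}


def prefix_for(origo: str, shared: bool) -> str:
    """Return the prefix for a given origo and shared/non-shared knowledge."""
    for prefix, gloss in ENGAGEMENT_PREFIXES.items():
        is_shared = gloss.speaker_knows and gloss.addressee_knows
        if gloss.origo == origo and is_shared == shared:
            return prefix
    raise ValueError(f"no prefix for origo={origo!r}, shared={shared!r}")


print(prefix_for("speaker", shared=False))   # na-
print(prefix_for("addressee", shared=True))  # shi-
```

Keying the lookup on origo and shared knowledge mirrors the way the text divides the four forms into two sets of two.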

In addition to epistemic marking, further constructions reflecting aspects of social cognition that can be found in the corpus relate to event depictions. The applicative prefix k- is used to indicate that a participant benefits (benefactive) or suffers (malefactive) from an event. A further function associated with this morpheme is the expression of non-volitionality. Moreover, events are depicted in terms of how an individual participates in them by indicating causation (-gwi and -sha) or reciprocity (zhi-, al-).

Other resources

The pioneering work on the grammar of Kogi was carried out between the late 19th and mid 20th century (Celedón 1886; Preuss 1921-1922, 1923-1924, 1925, 1926, 1927; Holmer 1952, 1953). More recent resources comprise various partial grammatical descriptions: Gawthorne (1984), Hensarling (1991), Holmer (1952; 1953), Ortíz Ricaurte (1989; 1994; 2000), Olaya Perdomo (2000), Stendal (1968; 1976). Bergqvist (2016) details the function of epistemic marking, and Knuchel (submitted) discusses the role of shared attention/knowledge in demonstratives.

The Kogi language is currently being documented in a research project at the University of Bern funded by the Swiss National Science Foundation (P0BEP1_165335). The goal of the project is an introduction to Kogi grammar with an in-depth description of verbal morpho-syntax.



Social background

Komnzo is the mother tongue of about 250 speakers in Rouku village, Western Province, Papua New Guinea. It is spoken in the southern lowland area of the island of New Guinea. In its biota, the area is more similar to northern Australia than to the rest of the island. We find eucalypts, melaleuca, acacias and banksias combined with wallabies, bandicoots, goannas, taipans and termite mounds. An early anthropologist describes the landscape as having a “mild, almost dainty, attractiveness in detail, but […] on the whole the extreme of monotony” (Williams 1936: 1).

Komnzo speakers refer to themselves as the “Farem tribe”; the word farem is the name of an important place in the vicinity of Rouku village. However, the formation of villages is a colonial legacy. Like most groups in this area, the Farem used to live in small hamlets, often just one or two patrilines of a clan. Komnzo speakers used to live a semi-nomadic life shaped by slash-and-burn agriculture combined with hunting, fishing and the gathering of various foods in the forest.

The staple food of the area is yam, and the social significance of this bland-tasting tuber can hardly be overstated. There are numerous rituals, spells and myths around yams and their cultivation. Like other languages in the family, Komnzo has a senary numeral system with words for the powers of six running up to six to the power of six (= 46,656 or simply wi in Komnzo). The numeral system is employed to count one’s share of yams, a family’s annual harvest or a clan’s contribution to a feast. The counting procedure is ritualised and often takes place in public, accompanied by the sound of kundu drums. In short, yams indicate the wealth and social status of an individual or of a group.
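To make the arithmetic of a base-six system concrete, the sketch below breaks a yam count down into powers of six. It illustrates senary counting in general, not the Komnzo counting ritual itself. The function name `to_senary_digits` is hypothetical, and no Komnzo numeral words are assumed beyond wi for six to the power of six, which is the only one given above.

```python
def to_senary_digits(n: int) -> list[int]:
    """Return the base-6 digits of a non-negative integer, most significant first."""
    if n < 0:
        raise ValueError("count must be non-negative")
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, 6)  # peel off the lowest power of six
        digits.append(remainder)
    return digits[::-1]


# Powers of six from 6^1 up to 6^6, the range the text says has dedicated words.
SENARY_POWERS = [6 ** k for k in range(1, 7)]

WI = 6 ** 6  # 46,656: the value the text glosses as "wi"

print(SENARY_POWERS)          # [6, 36, 216, 1296, 7776, 46656]
print(to_senary_digits(100))  # [2, 4, 4]  i.e. 2*36 + 4*6 + 4
print(to_senary_digits(WI))   # [1, 0, 0, 0, 0, 0, 0]
```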

The Farem still practise sister-exchange, whereby men from different places (villages, hamlets, etc.) marry each other’s true or classificatory sister. Exogamy is defined by a mixture of clan and place. Language (or variety) is defined by place. As a consequence, the Farem practise linguistic exogamy and children grow up in a highly multilingual society. It is quite common for a child to be fluent in three to four languages before entering the school system, which uses English as its main language.

Structural features

Komnzo belongs to the Tonda subgroup of the Yam language family (formerly known as the “Morehead Upper-Maro group”). Komnzo forms a dialectal chain together with Wèré, Anta, Wára, Kémä and Kánchá. However, in the local linguistic ideology these are regarded as languages in their own right.

The most frequent word order is SOV. Noun phrases are marked for case with absolutive/ergative alignment. In addition to ergative, absolutive, dative and possessive, there are ten non-core cases (e.g. locative, instrumental, temporal, associative).

Komnzo has a rich verbal morphology indexing person, gender and number of up to two arguments. Additionally, verbs encode 18 TAM categories, valency, directionality and deictic status. Morphological complexity lies not only in the number of categories that verbs may express, but also in the way these are encoded. Komnzo verbs exhibit what has been called ‘distributed exponence’ (Carroll 2017). Distributed exponence is characterised by the fact that single formatives are underspecified for a particular grammatical category. Therefore, morphological material from different sites has to be integrated first, and only after this integration can one arrive at a particular grammatical category.

An example is the way number is encoded in the language. Number marking for nominals as well as for the affixes on verbs makes a binary distinction between singular and non-singular number. In order to encode the three number values of singular, dual and plural, verbs have a special affix that makes a binary distinction of dual versus non-dual. Hence, a singular value is encoded by a combination of singular/non-dual, a plural value by non-singular/non-dual, whereas a dual value is expressed by the combination of non-singular/dual. The morphological system is rather economical in that it employs all possible combinations. For a subset of verbs (stative or positional verbs) the system allows for a fourth number category, the large plural, which is encoded by the somewhat odd combination of singular/dual. The principle of distributed exponence extends to other grammatical categories like aspect and tense.
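The way two binary markers combine into three (or four) number values can be sketched as a lookup over pairs. This is only an illustration of the combinatorics described above. The labels `sg`, `nsg`, `du` and `ndu` and the function `number_value` are hypothetical shorthand, not actual Komnzo formatives, and rejecting singular/dual on other verbs is an assumption drawn from the statement that only stative or positional verbs allow the large plural.

```python
# Each key pairs the singular/non-singular marker with the dual/non-dual marker.
NUMBER_VALUES = {
    ("sg", "ndu"): "singular",
    ("nsg", "du"): "dual",
    ("nsg", "ndu"): "plural",
    ("sg", "du"): "large plural",  # per the text, only with stative/positional verbs
}


def number_value(sg_marker: str, dual_marker: str, positional: bool = False) -> str:
    """Integrate the two underspecified markers into one number value."""
    key = (sg_marker, dual_marker)
    if key not in NUMBER_VALUES:
        raise ValueError(f"unknown marker combination: {key}")
    if key == ("sg", "du") and not positional:
        # Assumption: outside stative/positional verbs this combination is unused.
        raise ValueError("singular/dual is only described for stative or positional verbs")
    return NUMBER_VALUES[key]


print(number_value("sg", "ndu"))                  # singular
print(number_value("nsg", "du"))                  # dual
print(number_value("nsg", "ndu"))                 # plural
print(number_value("sg", "du", positional=True))  # large plural
```

Neither marker alone determines the number value; only the pair does, which is the point of distributed exponence.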

Together with Nen, a Nambu language of the Yam family, Komnzo was the target of a documentation project funded by the DOBES programme of the Volkswagen Foundation. The project has a website, and all collected materials are archived at the MPI digital archive in Nijmegen. A descriptive PhD grammar including a dictionary and sample texts was one of the outcomes of the project (Döhler 2016).


  • Carroll, Matthew J. 2017. The Ngkolmpu language: With special reference to distributed exponence. PhD thesis. Australian National University.
  • Döhler, Christian. 2016. Komnzo: a language of Southern New Guinea. PhD thesis. Australian National University.
  • Döhler, Christian. forthcoming. A grammar of Komnzo. Berlin: Language Science Press.
  • Williams, F. E. 1936. Papuans of the Trans-Fly. Oxford: Clarendon Press.

Matukar Panau


Matukar Panau is a highly endangered Oceanic language spoken around 45 km north of Madang, Papua New Guinea. Matukar is a village with around 500 people and Surumarang is a smaller hamlet with around 200 people. Of these 700 people, most (540) are under 30 years old and are unlikely to speak more than very basic Matukar Panau. Their first and dominant language is the English-based creole Tok Pisin. Another 130 or so people are between 30 and 50 years old. Their first language is Matukar Panau, but many instead speak primarily Tok Pisin. The dominant language for most of these people is certainly Tok Pisin, but they can and will still use Matukar Panau, and often code-switch with Tok Pisin. Around 25 people are over the age of 50, and while these speakers also speak Tok Pisin, they still speak Matukar Panau often and well.

Although the youngest adults and children speak primarily Tok Pisin, there is still societal value in conducting small social rituals in Matukar Panau as opposed to Tok Pisin. For instance, greetings are often given in Matukar Panau, such as good morning (tidom mami uyan), good day (sabi uyan), good afternoon (raurau uyan) or good night (tidom uyan), how are you doing? (uyan madonggo [sitting good] or uyan turago [standing good] or mateng ti, abab ti [no sickness, no wounds]?). Someone who primarily speaks Tok Pisin may still ask in Matukar Panau if someone has betel nut (mariu), lime (kau), mustard (ful) or cigarettes (kas). The person asked is expected to give one or two small items to the asker.

In addition to the Matukar Panau-Tok Pisin bilingualism, many villagers, especially older villagers, speak another indigenous language. There are many exogamous marriages, and so some people have learned the language of their spouse or parent from another village. People may speak a Papuan language like Bargam, or a closely related Oceanic language like Takia, or both. Some spouses of native Matukar villagers have also learned Matukar Panau to some extent. Other languages people speak include: Gedaged, English, Manam, Ngain, Pelipoai, Riwo, Waskia, Widar, Yamai, and Yoidik.

The language situation is therefore complex: multilingualism is prevalent and children are unlikely to learn the language productively, yet many people still strongly associate language with belonging.

Matukar Panau has interesting typological features such as multiple complex clause types, complex nominalizations, multiple possession strategies and variation of phonology and semantics due to gender and social network. Many of these interesting features seem to have developed due to the long contact and multilingualism with Papuan languages.

Matukar Panau language documentation is only possible with the help of the Matukar Panau Transcription and Translation team: Rudof Raward, Justin Willie, Alfred Sangmei, Amos Sangmei, Micheal Balias, and Zebedee Kreno† and the help from consultants to edit the data. The primary language consultant is Kadagoi Rawad Forepiso. Other consulting help has come from Kennedy Barui†, Cathy Samun Williang, Taleo Kreno, Berry Barui and John Bogg.




Sanzhi Dargwa is a Nakh-Daghestanian language from the Dargwa (or Dargi)[1] subbranch and belongs to the South Dargwa varieties. Sanzhi Dargwa is spoken by approximately 250 speakers and is heavily endangered. The self-designation of the Sanzhi people is sunglan-te (Sanzhi.person-pl) and the language is called sunglan ʁaj (Sanzhi.person language).

More than 40 years ago all Sanzhi speakers left the village of Sanzhi, their village of origin, in the Caucasian mountains. Sanzhi is located in the Daxadajevskij Rajon in central Daghestan, which is predominantly inhabited by speakers of Dargwa languages.

The village of Sanzhi is located in the valley of the river Ullučaj, at an altitude of about 1500 meters. Today, the majority of Sanzhi speakers live in the village of Druzhba in the Daghestanian lowlands (Kajakentskij Rajon) and to a lesser extent in other settlements in Daghestan and other parts of Russia.

Sanzhi belongs to the Dargwa (Dargi) languages which form a subgroup of the Nakh-Daghestanian language family. Sanzhi Dargwa is typologically similar to other Nakh-Daghestanian languages. It has a relatively large consonant inventory including pharyngeal and ejective consonants. With respect to its morphosyntactic structure, Sanzhi is predominantly dependent-marking with a rich case inventory. The grammatical cases are ergative, absolutive, dative and genitive. In addition, there is a plethora of spatial cases. The morphology is concatenative and predominantly suffixing. Sanzhi has a rich system of verbal TAM forms and spatial preverbs. Salient traits of the grammar are two largely independently operating agreement systems: gender/number agreement and person agreement. Gender/number agreement operates at the phrasal and at the clause level. Within the clause it is mainly controlled by arguments in the absolutive case and shows up on verbs, adverbs and on nouns in some of the spatial cases. Person agreement operates at the clausal level only and functions according to a person hierarchy. Sanzhi has ergative alignment. SOV is the most frequent word order.

Sanzhi Dargwa is an unwritten language that does not have any official status. Diana Forker is currently finishing a descriptive grammar of the language. Various topics in the morphosyntax of Sanzhi and other aspects of Sanzhi have been treated in Forker (Accepted a, b; Submitted a, b, c, d; 2016, 2014). A collection of texts with Russian translations, a Sanzhi-Russian and a Russian-Sanzhi dictionary can be found in Forker & Gadzhiev (2017).

Sanzhi is currently documented within the project, Documenting Dargi languages in Daghestan – Shiri and Sanzhi, funded by the DoBeS program of the Volkswagen Foundation. The project officially started in 2012 and will run until 2019. Within this project, three linguists (Diana Forker, Rasul Mutalov, Oleg Belyaev), one anthropologist (Iwona Kaliszewska) and student assistants from the University of Bamberg document, describe and analyze the two endangered Nakh-Daghestanian languages Shiri and Sanzhi Dargwa.

Detailed information about the project and the languages, along with many texts, recordings and pictures, can be found on the project website. All materials gathered in the project are accessible, subject to access restrictions, via the Language Archive hosted by the MPI Nijmegen. A subcorpus of 45,000 tokens has been fully glossed with FLEx and translated into Russian and English.

The electronic dictionary of Sanzhi was built with Lexique Pro and is accessible via the project homepage.

[1] There is no homogeneous English terminology referring to Dargwa languages, dialects, peoples, etc.; instead, several terms are in use (Dargwa, Dargva, Dargi, Darginski).


Social, economic and linguistic context

Vera’a is an Oceanic language spoken by a growing population of approximately 500 on the west coast of Vanua Lava in North Vanuatu. The social structure of the community is relatively flat: the community is ruled by a council of chiefs who are elected for different responsibilities every two years. The most significant aspect of social structure is the kin system, which is made up of matrilineal moieties that are arranged into two ‘lines’. The area of North Vanuatu was missionised by the Anglican church, and Christianity exists alongside traditional animist beliefs and practices. Nowadays, other Christian churches, like the Assembly of God or Seventh Day Adventist, have growing communities.

The Vera’a economy essentially rests on subsistence horticulture. In addition, people engage in arboriculture, fishing and the collection of shellfish as well as some wild root crops, fruits, and ferns. Copra is the cash crop of the community. Modern money is used for importing goods, paying school and hospital fees, and in some traditional domains like kava consumption, but also as part of bride prices.

Vera’a is a multilingual community, and most speakers of Vera’a also speak Vurës (Malau 2016), spoken in close vicinity of Vera’a on the same island, as well as Bislama, the lingua franca of Vanuatu that has the status of the ‘national language’. More recent immigration from the neighbouring island Mwotlap has resulted in a community of Mwotlap speakers in the north of Vanua Lava, and many Vera’a speakers speak or at least understand Mwotlap. Mota, which was used as the mission language by the Anglican church, is still present in some church contexts. The moribund language Lemerig has only a single last fluent speaker, but is partially remembered by many Vera’a speakers.

Structural features

Vera’a is a mostly configurational language where core grammatical functions are encoded primarily by arrangement of constituents relative to each other, both on clause and noun phrase level. Word order is AVO/SV on clause level and strictly head – modifier on NP level. Clause combining in discourse is mostly paratactic except for tail-head linkage constructions. The major areas of morphosyntactic complexity involve complex verbal predicates and adnominal possessive constructions. The former comprise nuclear-layer serial verb constructions and the use of directional defective verbs, adverbs and particles, and the integration of pronominal expressions into the predicate. The latter involve two types of adnominal construction, so-called direct and indirect possessive constructions. The constructional distinction corresponds roughly to the semantic distinction of inalienable versus alienable. Indirect possessive constructions are formed by the use of eight different relational possessive classifiers that characterise the relationship between possessor and possessum, for instance ‘eat’, ‘drink’ or ‘valuable belongings’ possession.

Aspects of social cognition are particularly relevant for the use of predicate-internal event-depictive and stance adverbs (frustrative, empathetic, etc.), benefactive constructions involving possessive morphology within the predicate, the virtually categorical use of pronouns for speech-act participant subjects and objects, direct possessive constructions for kin relations, number-marking and dyadic constructions sensitive to social roles.

**More Language Profiles coming soon**