

Ivana Lucica 3, 10000 Zagreb, CROATIA
tel. (+385 1) 6120-011, 6120-142; fax. (+385 1) 6156-879
e-mail: zzl@ffzg.hr
CROATIAN NATIONAL CORPUS
The Institute of Linguistics is one of the organizational units of the Faculty of Philosophy.
The Institute is the center of most of the linguistically oriented projects at the Faculty of Philosophy.
This document covers several topics on the Institute of Linguistics:
The Institute of Linguistics was founded in 1960 upon the suggestion of five distinguished professors of the Faculty of Philosophy (Mirko Deanovic, Rudolf Filipovic, Vladimir Gortan, Josip Hamm and Vojmir Vinja). The initial purpose of such an institution was to organize linguistic research at the Faculty more efficiently. The first director of the Institute was Professor Ljudevit Jonke (1960-1963), followed by Professor Rudolf Filipovic (1963-1983), Professor Milan Mogus (1983-1992), Dr. Maja Bratanic (1992-1994) and Dr. Vesna Muhvic-Dimanovski (1994-).
The general aims of the Institute defined from its beginning have been:
In the 1960s the main activity of the Institute was predominantly oriented towards Croatian in contact with other languages. In the 1970s the project of compiling the Croatian language corpora started. An overview of projects (concluded and current) can be divided into three areas:
Two areas of computational linguistics have been covered by projects:
A one-million-token corpus of running text in Croatian (M-corpus), covering the
1937-1978 period and compiled by Prof. Milan Mogus.
The corpus is divided into 5 sub-corpora (prose, poetry, drama, secondary school textbooks,
newspapers; 200,000 tokens each).
The processing includes:
A 30-million corpus of running text in Croatian (see current projects).
See also: Croatian Corpus Processing: State-of-art and Perspectives.
A computational model of the Croatian inflectional system has been designed and
prototyped. As a result of that research, the Morphological generator of Croatian
word-forms has been developed.
At this moment (fall 1996) the filling of the lexicon is under way. The output of
generation will be a list of word-forms accompanied by their grammatical categories, which
can be used for semi-automatic lemmatization and corpus tagging.
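The idea of generating word-forms with their grammatical categories and then using that list for lemmatization can be illustrated with a minimal sketch. It is not the Institute's generator: the paradigm table, the tag format, the toy lexicon and the function names below are all assumptions made for illustration, and only the singular of one regular feminine noun class is covered.

```python
from collections import defaultdict

# Hypothetical paradigm table: singular case endings of one regular
# feminine a-declension class (illustrative only, not the Institute's model).
A_DECLENSION_SG = {
    "Nom": "a", "Gen": "e", "Dat": "i", "Acc": "u",
    "Voc": "o", "Loc": "i", "Ins": "om",
}

# Hypothetical toy lexicon: lemma -> stem.
LEXICON = {"riba": "rib", "soba": "sob"}


def generate(lemma, stem, paradigm=A_DECLENSION_SG):
    """Return (word_form, lemma, tag) triples for one lexicon entry."""
    return [
        (stem + ending, lemma, f"N-f-sg-{case}")
        for case, ending in paradigm.items()
    ]


def build_lookup(lexicon):
    """Invert the generated list: word_form -> list of (lemma, tag) analyses."""
    lookup = defaultdict(list)
    for lemma, stem in lexicon.items():
        for form, lem, tag in generate(lemma, stem):
            lookup[form].append((lem, tag))
    return dict(lookup)


lookup = build_lookup(LEXICON)

# A corpus token is lemmatized by looking it up in the generated list.
for token in ["ribom", "sobi"]:
    print(token, lookup.get(token, []))
```

A form such as "sobi" receives two analyses (dative and locative), which is one reason a list-based approach of this kind is only semi-automatic: ambiguous tokens still need to be resolved by a human or by context.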
The areas covered by the Institute's contrastive projects are:
The method of compiling the material and processing the corpus was tested in pilot research on the one-million corpus of Croatian, carried out at this institution within the framework of a previous project financed by MZT (6-03-048). The project can be expected to have an influence in the following areas:
The Institute has developed a rather strong publishing activity, which encompasses several series of publications such as:
More than 60 books and studies have been published as a result of the work on
the Institute projects.
The Institute is also co-publisher of the journal Suvremena lingvistika (Contemporary
Linguistics), which is indexed in the MLA and BL. It also publishes the Bulletin of the
Institute of Linguistics, where thorough bibliographical data as well as articles on
ongoing projects are published.
Vesna Muhvic-Dimanovski, Ph.D.
Marko Tadic, Ph.D.
Ph.D. in computational linguistics; field of interest: corpus linguistics,
corpora compiling and processing, computational morphology
Ida Raffaelli, M.A.
M.A. in linguistics; field of interest: semantics in historical perspective,
mediaeval French, chronicles
Marica Cilas
B.A. in Croatian language and literature; field of interest: Croatian syntax
Bosko Bekavac
B.A. in linguistics and information sciences; field of interest:
computational linguistics, corpus linguistics
Milena Zic-Fuchs, Ph.D.
Zrinjka Glovacki-Bernardi, Ph.D.
Ph.D. in linguistics; field of interest: discourse analysis, languages in contact:
German and Croatian; principal researcher of the project: Croatian-German
linguistic relations.
Professor Dubravka Sesar
Ph.D. in linguistics; field of interest: Slavic languages, language
standardization processes, Czech and Slovak; principal researcher of the project: The
analysis of West-Slavic languages.
Professor Miro Kacic
Ph.D. in linguistics; field of interest: theoretical linguistics, mathematical linguistics,
NLP, Croatian and French; principal researcher of the project: Descriptive grammar of
French with special reference to Croatian.
Last change 1997-10-07 by Marko Tadic