
Resources and tools

Parliamentary Corpus of first Yugoslavia (1919-1939) yu1Parl 1.0

The yu1Parl 1.0 corpus contains stenographic transcripts of the sessions of the national representation of the First Yugoslavia from 1919 to 1939. It covers the Temporary National Representation of the Kingdom of Serbs, Croats, and Slovenes (1919–1920), the Legislative Committee of the National Assembly of the Kingdom of Serbs, Croats, and Slovenes (1921–1922), and the National Representation of the Kingdom of Yugoslavia (1931–1939), the last of which includes sessions of both the National Assembly and the Senate. Transcripts for a longer period, from 1923 to 1928, are missing because they were not available in digital form when the corpus was created. The corpus was compiled from scanned printed transcripts published as PDF documents on the SIstory portal. It consists of facsimiles of the transcripts in PDF format and corresponding machine-readable XML documents in the Parla-CLARIN TEI format, annotated with metadata and with linguistic annotation that includes morphosyntactic tagging and lemmatization. The corpus contains 714 session transcripts totalling 15,403 pages, with approximately 13 million words in speeches. The transcripts are multilingual: Slovene (3% of sentences) and Serbo-Croatian, the latter written in both Cyrillic (59% of sentences) and Latin script (38% of sentences). The corpus is available in the CLARIN.SI repository, with links to the concordancers noSketch Engine and KonText, and through the web application ParlaVis.
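As a rough illustration of how the annotated XML might be read, the following Python sketch extracts words, lemmas, and sentence language from a TEI fragment. It assumes the element and attribute names commonly used in Parla-CLARIN encodings (`u`, `seg`, `s`, `w`, `lemma`, `msd`, `xml:lang`); the exact markup of yu1Parl 1.0 may differ, and the sample fragment, speaker ID, and tag values are invented for the example rather than taken from the corpus.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "{http://www.tei-c.org/ns/1.0}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical fragment, invented for illustration; not taken from the corpus.
# Element names, the speaker ID, and the msd values are assumptions.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div>
        <u who="#speaker1">
          <seg>
            <s xml:lang="sl">
              <w lemma="seja" msd="Ncfsn">Seja</w>
              <w lemma="biti" msd="Va-r3s-n">je</w>
              <w lemma="odprt" msd="Appfsn">odprta</w>
              <pc>.</pc>
            </s>
          </seg>
        </u>
      </div>
    </body>
  </text>
</TEI>"""


def read_sentences(xml_text):
    """Return one dict per <s> element with its language and (form, lemma) pairs."""
    root = ET.fromstring(xml_text)
    sentences = []
    for s in root.iter(TEI_NS + "s"):
        # Only <w> elements are counted as words; <pc> (punctuation) is skipped.
        words = [(w.text, w.get("lemma")) for w in s.iter(TEI_NS + "w")]
        sentences.append({"lang": s.get(XML_LANG), "words": words})
    return sentences


sentences = read_sentences(SAMPLE)
lang_counts = Counter(s["lang"] for s in sentences)
word_count = sum(len(s["words"]) for s in sentences)

print(lang_counts)
print(word_count)
```

Counting sentences per `xml:lang` value in this way is one plausible route to language shares like those quoted above, provided the language (and script) is actually recorded at sentence level in the corpus files.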