Abstract
This thesis presents a systematic, empirical investigation of how an existing part-of-speech (PoS) tag set can be modified and optimized for the task of syntactic dependency parsing of Norwegian. The tag set of the Norwegian Dependency Treebank is modified and optimized through experiments with the morphological features in the treebank. The experiments are complemented by an evaluation of a range of state-of-the-art PoS taggers and syntactic parsers applied to Norwegian. The results of our work are concrete contributions to the Norwegian NLP community: (i) a data set split (training/development/test) of the Norwegian Dependency Treebank; (ii) a PoS tag set optimized for syntactic dependency parsing of Norwegian; (iii) a PoS tagger model trained on the treebank; and (iv) a syntactic parser model trained on the treebank.