dc.date.accessioned: 2014-12-01T15:19:45Z
dc.date.available: 2014-12-01T15:19:45Z
dc.date.created: 2013-05-25T23:28:54Z
dc.date.issued: 2013
dc.identifier.uri: http://hdl.handle.net/10852/41688
dc.description.abstract: Despite the existence of effective methods that solve named entity recognition tasks for such widely used languages as English, there is no clear answer as to which methods are the most suitable for languages that are substantially different. In this paper we attempt to solve a named entity recognition task for Lithuanian, using a supervised machine learning approach and exploring different sets of features in terms of orthographic and grammatical information, different windows, etc. Although the performance is significantly higher when language-dependent features based on gazetteer lookup and automatic grammatical tools (part-of-speech tagger, lemmatizer or stemmer) are taken into account, we demonstrate that the performance does not degrade when features based on grammatical tools are replaced with affix information only. The best results (micro-averaged F-score = 0.895) were obtained using all available features, but the results decreased by only 0.002 when features based on grammatical tools were omitted.
Jurgita Kapociute-Dzikiene, Anders Nøklestad, Janne Bondi Johannessen, Algis Krupavicius (2013). Exploring Features for Named Entity Recognition in Lithuanian Text Corpus. Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013); May 22–24, 2013; Oslo University, Norway. NEALT Proceedings Series 16. http://www.ep.liu.se/ecp_article/index.en.aspx?issue=085;article=011 [en_US]
dc.language: EN
dc.language.iso: en [en_US]
dc.title: Exploring Features for Named Entity Recognition in Lithuanian Text Corpus [en_US]
dc.type: Chapter [en_US]
dc.creator.author: Kapociute-Dzikiene, Jurgita
dc.creator.author: Nøklestad, Anders
dc.creator.author: Johannessen, Janne Bondi
dc.creator.author: Krupavicius, Algis
cristin.unitcode: 185,14,35,0
cristin.unitname: Institutt for lingvistiske og nordiske studier
cristin.ispublished: true
cristin.fulltext: original
dc.identifier.cristin: 1030441
dc.identifier.startpage: 73
dc.identifier.endpage: 88
dc.identifier.urn: URN:NBN:no-46155
dc.type.document: Bokkapittel (book chapter) [en_US]
dc.type.peerreviewed: Peer reviewed
dc.identifier.fulltext: Fulltext https://www.duo.uio.no/bitstream/handle/10852/41688/1/ecp1385011.pdf
dc.type.version: PublishedVersion
cristin.btitle: Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013)

