Towards large-scale language analysis in the cloud
Original version: Proceedings of the workshop on Nordic language research infrastructure at NODALIDA 2013. 2013, 1-10
Abstract: This paper documents ongoing work within the Norwegian CLARINO project on building a Language Analysis Portal (LAP). The portal will provide an intuitive and easily accessible web interface to a centralized repository of a wide range of language technology tools, all installed on a high-performance computing cluster. Users will be able to compose and run workflows through an easy-to-use graphical interface, chaining multiple tools and resources together in potentially complex pipelines. Although the project aims to reach a diverse set of user groups, it aims in particular to facilitate the use of language analysis in the social sciences, humanities, and other fields without strong computational traditions. Although development of the portal is still in its early stages, the paper describes progress towards an already operable pilot and gives an overview of long-term goals and visions. At the core of the current pilot implementation is Galaxy, a web-based workflow management system initially developed for data-intensive research in genomics and bioinformatics; an important part of the work on the pilot is therefore to adapt and evaluate Galaxy for the context of a language analysis portal.
Emanuele Lapponi, Erik Velldal, Nikolay A. Vazov, Stephan Oepen (2013). Towards Large-Scale Language Analysis in the Cloud. Proceedings of the workshop on Nordic language research infrastructure at NODALIDA 2013, May 22-24, 2013, Oslo, Norway. NEALT Proceedings Series 20. http://www.ep.liu.se/ecp_article/index.en.aspx?issue=089;article=001