Construcción de un corpus etiquetado sintácticamente para el euskera [Construction of a syntactically annotated corpus for Basque]
- Aduriz, Itziar
- Aldezabal Roteta, Izaskun
- Aranzabe Urruzola, María Jesús
- Arrieta Kortajarena, Bertol
- Arriola Egurrola, José María
- Atutxa Salazar, Aitziber
- Díaz de Ilarraza Sánchez, Arantza
- Gojenola Galletebeitia, Koldobika
- Oronoz Anchordoqui, Maite
- Sarasola Gabiola, Kepa
ISSN: 1135-5948
Year of publication: 2002
Issue Title: XVII Congreso de la SEPLN. Universidad de Valladolid, 11-13 septiembre 2002
Issue: 29
Pages: 5-11
Type: Article
More publications in: Procesamiento del lenguaje natural
Abstract
The aim of this work is the construction of a syntactically annotated treebank for Basque. In this paper we first present the basis of the annotation. After examining several options, we chose the scheme presented in (Carroll et al., 1998). It follows the EAGLES standards and is based on the idea of adding to each sentence in the corpus a series of grammatical relations that specify the dependencies between modifiers and their nuclei. After presenting the formalism, we describe the problems we found and the decisions we took to solve them. Next, we present an example showing the application of the scheme to an initial corpus. Finally, we present the main conclusions about the applicability of the scheme to Basque, together with future work.