Uso de información morfológica en el alineamiento español-euskera

  1. Agirre Bengoa, Eneko
  2. Díaz de Ilarraza Sánchez, Arantza
  3. Labaka Intxauspe, Gorka
  4. Sarasola Gabiola, Kepa
Journal: Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2006

Issue: 37

Pages: 257-266

Type: Article

Abstract

In this paper we present a preliminary study on aligning a Spanish-Basque parallel corpus with a token-based aligner (GIZA++). We studied several morphological pre-processing alternatives and achieved a 23.76% Alignment Error Rate, a reduction of 12.48% over the baseline (no pre-processing). The results are comparable to those obtained for other agglutinative languages.
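
For readers unfamiliar with the metric, the following is a minimal sketch of how an Alignment Error Rate is commonly computed. It assumes the widely used definition AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|), where A is the set of predicted links, S the sure gold links and P the possible gold links. The abstract does not state which definition the authors used, and the function name and toy links below are hypothetical, not data from the paper.

```python
def alignment_error_rate(predicted, sure, possible):
    """Compute AER = 1 - (|A & S| + |A & P|) / (|A| + |S|).

    Each argument is an iterable of (source_index, target_index) links.
    Assumption: sure links are also counted as possible links, as is
    conventional for this metric.
    """
    a = set(predicted)
    s = set(sure)
    p = set(possible) | s  # treat every sure link as possible too
    if not a and not s:
        return 0.0  # nothing predicted and nothing expected: no error
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


# Hypothetical toy links (word positions), for illustration only.
predicted = {(0, 0), (1, 1), (2, 3)}
sure = {(0, 0), (1, 1)}
possible = {(2, 2)}

aer = alignment_error_rate(predicted, sure, possible)
print(f"AER = {aer:.2%}")
```

Lower values are better: a prediction that matches every sure link and adds only possible links scores 0.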