Paper accepted at EMNLP 2015
13 Aug 2015
The list of accepted papers for EMNLP 2015 is now available at EMNLP 2015 accepted papers.
We had the following long paper accepted at EMNLP 2015.
Title: Improving Statistical Machine Translation with a Multilingual Paraphrase Database
Authors: Ramtin Mehdizadeh Seraj; Maryam Siahbani; Anoop Sarkar
Abstract:
The multilingual Paraphrase Database (PPDB) is a freely available automatically created resource of paraphrases in multiple languages. In statistical machine translation, paraphrases can be used to provide translation for out-of-vocabulary (OOV) phrases. In this paper, we show that a graph propagation approach that uses PPDB paraphrases can be used to improve overall translation quality. We provide an extensive comparison with previous work and show that our PPDB-based method improves the BLEU score by up to 1.79 percent points. We show that our approach improves on the state of the art in three different settings: when faced with limited amount of parallel training data; a domain shift between training and test data; and handling a morphologically complex source language. Our PPDB-based method outperforms the use of distributional profiles from monolingual source data.
Out of 1315 valid submissions, 312 were accepted, which gives an acceptance rate of about 24% (312/1315 ≈ 23.7%) for EMNLP 2015.