Te Bu passes his MSc thesis exam
17 Nov 2015
Te Bu successfully defended his MSc thesis on Nov 17th 2015.
The title of the thesis: “Joint prediction of word alignment and alignment types for statistical machine translation”.
Learning word alignments between parallel sentence pairs is an important task in Statistical Machine Translation. Existing models for word alignment have assumed that word alignment links are untyped. In this work, we propose new machine learning models that use linguistically informed link types to enrich word alignments. We use 11 different alignment link types based on annotated data released by the Linguistic Data Consortium. We first provide a solution to the sub-problem of alignment type prediction given an aligned word pair and then propose two different models to simultaneously predict word alignment and alignment types. Our experimental results show that we can recover alignment link types with an F-score of 81.4%. Our joint model improves the word alignment F-score by 4.6% over a baseline that does not use typed alignment links. We expect typed word alignments to benefit SMT and other NLP tasks that rely on word alignments.
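To illustrate what scoring typed versus untyped alignment links can look like, here is a minimal sketch in Python. It is not the thesis implementation: the link representation, the function names, and the type labels "SEM" and "FUN" are assumptions chosen for illustration, and the thesis may define its F-score differently (for example, with different handling of uncertain links).

```python
from typing import Set, Tuple

# Hypothetical representation: (source index, target index, link type).
TypedLink = Tuple[int, int, str]


def f_score(predicted: Set, gold: Set) -> float:
    """Balanced F-score (harmonic mean of precision and recall) over link sets."""
    correct = len(predicted & gold)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


def untyped(links: Set[TypedLink]) -> Set[Tuple[int, int]]:
    """Drop the type label so that only the word pairing is compared."""
    return {(src, tgt) for src, tgt, _ in links}


# Toy data with placeholder type labels (assumed, not taken from the thesis).
gold = {(0, 0, "SEM"), (1, 2, "FUN"), (2, 1, "SEM")}
predicted = {(0, 0, "SEM"), (1, 2, "SEM"), (2, 2, "SEM")}

# Untyped score: a link counts as correct if the word pair matches.
untyped_f = f_score(untyped(predicted), untyped(gold))

# Typed score: a link counts as correct only if the pair and the type match.
typed_f = f_score(predicted, gold)

print(f"untyped F-score: {untyped_f:.3f}")
print(f"typed F-score:   {typed_f:.3f}")
```

In this toy example the typed score is lower than the untyped one, because the link (1, 2) pairs the right words but carries the wrong type.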
M.Sc. Examining Committee:
- Dr. Anoop Sarkar, Senior Supervisor
- Dr. Fred Popowich, Supervisor
- Dr. Greg Mori, Examiner
- Dr. William (Nick) Sumner, Chair
More information on the Student theses page