SFU Neural MT class: Project

Tensor Flow is a popular open source library for building RNNs (and other useful things).

Final Project

Due on December 8, 2017

The goal for each final project is a submission to a conference at the end of the semester (perhaps with some more work on the paper in the Spring semester).

You must turn in two things:
1. A write-up that explains what you accomplished in your course project. It should be written in a clear, scientific style and contain enough information for another student to replicate your results. More information about the write-up is given below.
2. Your code. The code should be self-contained, self-documenting, and easy to use. Please include a detailed description of how to run your code.
Each group should assign one member to upload the write-up and source code to Coursys as a single tarball or zip files as the submission for Final Project.

Write-up

Your write-up is the most important part of your project. It must have the following sections:

Motivation
- Which aspect of the translation system did your group choose to improve and reasons for your choice.
Approach
- Describe the algorithms and machine learning models used in your project. Use a clear mathematical style to explain your model(s).
Data
- Exactly which data files were used; also include here any external data that was not provided to you.
Code
- Describe which homework code you used in your project. Provide exactly which code was used in your project not written by your group (e.g. use of an aligner from an open-source project).
Experimental Setup
- Describe what kind of evaluation you are doing and which methods you are comparing against each other.
Results
- Include a detailed comparison of different methods.
Analysis of the Results
- Did you improve over the baseline. Why or why not?
Future Work
- What could be fixed in your approach. What you did not have time to finish, but you think would be a useful addition to your project.

Submission Format

For your write-up, you can use plain ASCII but for math equations and tables it is better to use either latex or kramdown. Do not use any proprietary or binary file formats such as Microsoft Word.

If you use latex then use the following style files from the Submission Format section:

Here is an example of using these style files:

To get a PDF file you should run the following commands (which assume you have installed LaTeX on your system):

pdflatex project
bibtex project
pdflatex project
pdflatex project

You can then open project.pdf using any PDF viewer. Please submit the LaTeX source and the PDF file along with your source code when you submit your project to Coursys.

Grading system

The final projects will be graded using the following criteria:

Originality
Substance (amount of work done for the project)
Well documented use of prior results from research papers
Clarity of the writing
Quality of experimental design
Quality of evaluation and results
(Re)Use of existing homework code
Overall score (based on the above criteria)

Examples

Here is an example of good project submissions. Make sure you have the sections required by the Write-up section above.

Example 2 by Dong Song and Anoop Sarkar

Data

NLP class data from Anoop Sarkar’s NLP class.
WMT 2015 data from the Workshop on MT 2015 Shared Task on Translation.
CSLM Paper Data by Holger Schwenk.

Software

lamtram/cnn

lamtram: A toolkit for neural language and translation modeling

Tensor Flow

Tensor Flow: follow the seq2seq tutorial.

nematus/Theano

nematus based on Kyunghun Cho’s dl4mt tutorial.

NMT with Coverage and Context Gating

NMT with Coverage by Zhaopeng Tu built on top of Groundhog and Theano.

Project Ideas

Scaling to larger vocabulary size

More scalable softmax computation.

Different RNNs for NMT

Stack RNN (equivalently LSTM) for NMT. Stack RNNs have been used for transition based dependency parsing. See Goldberg tutorial explanation of stack RNN.
Use a multilingual corpus. Fix alignments using HMM aligner. Use alignments to train RNN. Extract paraphrases using multilingual RNN. (see papers by Callison-Burch and collaborators for extracting paraphases)

Ensemble Models for NMT

Explore different ways to create ensemble models for NMT

Multilingual NMT

Use word embeddings for low-resource languages. Improve performance on a language pair such as Haitian Kreyol to English where there is insufficient data for training a neural MT model.
Pivot based NMT.

Complex reorderings in neural MT

Go beyond the attention model to capture more complex reorderings and experiment on language pairs such as Chinese-English.
Better attention models for complex reordering.

Improve SGD

Use SAG (Schmidt et al, 2015) to improve SGD for RNN-LM or NMT.

Global sequence to sequence learning

Improve upon bidirectional RNNs by exploring global inference and complex non-differentiable loss functions such as BLEU scores. Currently the state of the art is to have a layer on top of a bidirectional sequence model which does global reordering.

An interesting first step might be to take the alignments learned by a (neural) HMM alignment model and use the entire distribution of alignments within a decoder.

Decipherment using neural LMs

Can we improve on computational decipherment methods by using neural LMs and going beyond HMMs for scoring the cipher keys. A more ambitious idea is to map images of text to language models bypassing the possibly error-prone human element in converting the visual script to machine readable text. For instance, scans of the Voynich manuscript are available from the Yale library. Can they be mapped to another language for which we have a language model.

Latent variable RNN-LMs

Latent variable RNN-LMs. See paper by Saul and Pereira.
Use Brown clusters to initialize word vectors for RNN-LM.

Visualization of hidden context vectors

Build a visualizer to enable debugging of NMT systems.