Advanced Methods in NLP Summer 2023
Deep learning for NLP has improved results on many NLP tasks largely by scaling up models; as a consequence, the models that scale best are the ones studied and used in both research and industry. This course focuses on current advanced methods that are considered state-of-the-art in NLP. These approaches are built on the Transformer model, which replaces difficult-to-train recurrent models with positional encoding, multi-headed self-attention, and techniques that allow stacks of feed-forward layers to scale to millions and even trillions of parameters. The course will cover the various aspects of training large models on natural language, fine-tuning them for specific NLP tasks, and then using inference and model selection to run these models.
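At the heart of the Transformer is scaled dot-product self-attention. As a rough illustration only (a minimal single-head sketch in PyTorch, not code from this course; the function name, tensor shapes, and toy dimensions are our own assumptions):

    import math
    import torch
    import torch.nn.functional as F

    def self_attention(x, w_q, w_k, w_v):
        """Minimal single-head scaled dot-product self-attention (sketch).

        x:             (batch, seq_len, d_model) embeddings + positional encodings
        w_q, w_k, w_v: (d_model, d_k) projection matrices
        """
        q = x @ w_q                                  # queries: (batch, seq_len, d_k)
        k = x @ w_k                                  # keys:    (batch, seq_len, d_k)
        v = x @ w_v                                  # values:  (batch, seq_len, d_k)
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
        weights = F.softmax(scores, dim=-1)          # attention over positions
        return weights @ v                           # (batch, seq_len, d_k)

    # Toy usage: batch of 2 sequences, length 5, model dim 16, head dim 8
    x = torch.randn(2, 5, 16)
    w_q, w_k, w_v = (torch.randn(16, 8) for _ in range(3))
    out = self_attention(x, w_q, w_k, w_v)
    print(out.shape)  # torch.Size([2, 5, 8])

Multi-headed attention runs several such projections in parallel and concatenates the results before a final linear layer.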
The course will focus on the following topics, with an emphasis on
efficient approaches that can be run without an expensive compute
budget:
- Transformer models in NLP
- Data Collection and Preprocessing
- Model design choices
- Pre-training (PLMs - pre-trained language models)
- Fine-tuning PLMs
- Inference
- Model Selection
We will read papers on these topics; students will present the papers and lead the in-class discussion in collaboration with the instructor. The overall goal of the course is for each student to produce a project that improves efficiency in one or more of these aspects of advanced methods in NLP.
Educational Goals
At the end of this course, the student should:
- Understand advanced NLP models such as Transformers.
- Be able to write code to implement such models.
- Fine-tune an existing pre-trained language model for a particular
NLP task or tasks.
- Fine-tune an existing pre-trained language model with frozen parameters (see the sketch after this list).
- Write the inference code necessary to solve an NLP task.
- Understand hyperparameter search.
- Be able to modify the basic model architecture to improve the
state-of-the-art in at least one of the topics covered in this
course.
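As a hedged sketch of the frozen-parameter goal above (using the Hugging Face transformers library; the model name, toy batch, and hyperparameters are illustrative assumptions, not course requirements):

    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # Load a pre-trained encoder with a fresh classification head.
    # (Model choice is an illustrative assumption.)
    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=2)
    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

    # Freeze the pre-trained backbone; only the new head will be updated.
    for param in model.base_model.parameters():
        param.requires_grad = False

    # Optimize only the parameters that still require gradients.
    optimizer = torch.optim.AdamW(
        (p for p in model.parameters() if p.requires_grad), lr=5e-4)

    # One illustrative training step on a toy batch.
    batch = tokenizer(["great movie", "terrible movie"],
                      return_tensors="pt", padding=True)
    labels = torch.tensor([1, 0])
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

Freezing the backbone and training only the task head is one of the cheapest fine-tuning strategies, which fits the course's emphasis on approaches that do not require an expensive compute budget.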
Instructor
Teaching Assistants
- Hassan Shavarani, sshavara, Office hour: Tuesdays 1pm on Zoom (details on discussion board).
- Sha Hu, hushah, Office hour: Fridays 5pm on Zoom (details on discussion board).
Asking for help
- Use the web, ChatGPT and Codex
- Ask for help on the Coursys forum
- Instructor office hours: Thursdays, 9:30am on Zoom (link on discussion forum).
- Do not email the TAs; email the instructor only about matters that cannot go to the discussion board.
- Use only your SFU email address, and use cmpt419: or cmpt983: (for undergrad and grad respectively) as the subject prefix.
Time and place
- Mon 10:30-12:20 AQ 3153 (room change from the original: WMC 3210)
- Wed 10:30-11:20 WMC 3260
- Last day of classes: Aug 4, 2023
Calendar
Textbook
- No required textbook. Online readings are provided in the Syllabus.
Grading
- Submit homework source code and check your grades on Coursys
- Programming setup homework: HW0 due on May 29, 2023 (2%)
- Four programming homeworks. Due dates: HW1 on June 5, 2023, HW2 on June 26, 2023, HW3 on July 19, 2023, HW4 on July 31, 2023 (12% each)
- Participation: Helping other students on the discussion board in a positive way (5%)
- In class presentation (5%)
- Final Project Proposal: Due on July 24, 2023 (10%)
- Final Project: Due on Aug 10, 2023 (30%)
- Final Project Poster Session:
- Time: Aug 10, 2023
- Location: ASB Atrium (tentative)