Advanced Methods in NLP Summer 2023

Deep learning for NLP has improved results on many NLP tasks largely by scaling up models, and as a consequence the models that scale best are the ones studied and used in both research and industry. This course focuses on current methods that are considered state of the art in NLP. These approaches build on the Transformer model, which replaces difficult-to-train recurrent models with positional encoding, multi-headed self-attention, and various techniques for training stacks of feed-forward layers that scale to millions and even trillions of parameters. The course covers training large models on natural language, fine-tuning them for specific NLP tasks, and then using inference and model selection to run these models.
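
As a concrete illustration of the components named above, here is a minimal sketch, assuming PyTorch, of sinusoidal positional encoding and scaled dot-product multi-headed self-attention. The learned query/key/value projections of a real Transformer layer are omitted for brevity, and all shapes and hyperparameters are illustrative, not taken from the course materials.

    import math
    import torch

    def positional_encoding(seq_len: int, d_model: int) -> torch.Tensor:
        # Sinusoidal encodings as in "Attention Is All You Need".
        pos = torch.arange(seq_len).unsqueeze(1)                      # (seq_len, 1)
        div = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(seq_len, d_model)
        pe[:, 0::2] = torch.sin(pos * div)
        pe[:, 1::2] = torch.cos(pos * div)
        return pe

    def multi_head_self_attention(x: torch.Tensor, num_heads: int) -> torch.Tensor:
        # Split d_model across heads, attend, then re-merge the heads.
        batch, seq_len, d_model = x.shape
        d_head = d_model // num_heads
        # A real layer applies learned Q/K/V projections first; omitted here.
        qkv = x.view(batch, seq_len, num_heads, d_head).transpose(1, 2)
        scores = qkv @ qkv.transpose(-2, -1) / math.sqrt(d_head)      # (b, h, s, s)
        weights = torch.softmax(scores, dim=-1)
        out = weights @ qkv                                           # (b, h, s, d_head)
        return out.transpose(1, 2).reshape(batch, seq_len, d_model)

    x = torch.randn(2, 8, 64) + positional_encoding(8, 64)            # toy batch
    print(multi_head_self_attention(x, num_heads=4).shape)            # torch.Size([2, 8, 64])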

The course will focus on the following topics, with an emphasis on efficient approaches that can be run without a large compute budget:

  1. Transformer models in NLP
  2. Data Collection and Preprocessing
  3. Model design choices
  4. Pre-training (PLMs: pre-trained language models)
  5. Fine-tuning PLMs
  6. Inference
  7. Model Selection

We will read papers on these topics; students will present the papers and lead the in-class discussion in collaboration with the instructor. The overall goal of the course is for each student to produce a project that improves efficiency in one or more of these aspects of advanced methods in NLP.

Educational Goals

At the end of this course, the student should:

  1. Understand advanced NLP models such as Transformers.
  2. Be able to write code to implement such models.
  3. Fine-tune an existing pre-trained language model for a particular NLP task or tasks.
  4. Fine-tune an existing pre-trained language model with frozen parameters (see the sketch after this list).
  5. Write the inference code necessary to solve an NLP task.
  6. Understand hyperparameter search.
  7. Be able to modify the basic model architecture to improve the state-of-the-art in at least one of the topics covered in this course.
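
The sketch below, referenced from goal 4, shows one common way to fine-tune with frozen parameters, assuming the Hugging Face transformers library and an illustrative DistilBERT checkpoint. It is one possible setup under those assumptions, not the course's prescribed recipe.

    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # Illustrative checkpoint; any sequence-classification PLM would do.
    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=2
    )
    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

    # Goal 4: freeze the pre-trained encoder so only the newly initialized
    # classification head (which keeps requires_grad=True) is trained.
    for param in model.base_model.parameters():
        param.requires_grad = False

    optimizer = torch.optim.AdamW(
        (p for p in model.parameters() if p.requires_grad), lr=1e-3
    )

    # One toy training step on a made-up example.
    batch = tokenizer(["an example sentence"], return_tensors="pt")
    loss = model(**batch, labels=torch.tensor([1])).loss
    loss.backward()
    optimizer.step()

    # Goal 5 (inference): the same forward pass, without gradients.
    with torch.no_grad():
        prediction = model(**batch).logits.argmax(dim=-1)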

Instructor

Teaching Assistants

  • Hassan Shavarani, sshavara, Office hour: Tuesdays 1pm on Zoom (details on discussion board).
  • Sha Hu, hushah, Office hour: Fridays 5pm on Zoom (details on discussion board).

Asking for help

  • Use the web, ChatGPT and Codex
  • Ask for help on the Coursys forum
  • Instructor office hours: Thursdays, 9:30am on Zoom (link on discussion forum).
  • Do not email the TAs; email the instructor only about matters that cannot go to the discussion board
  • Use only your SFU email address, with the subject prefix cmpt419: (undergraduate) or cmpt983: (graduate)

Time and place

  • Mon 10:30-12:20 AQ 3153 (room change from the original: WMC 3210)
  • Wed 10:30-11:20 WMC 3260
  • Last day of classes: Aug 4, 2023

Calendar

Textbook

  • No required textbook. Online readings provided in Syllabus.

Grading

  • Submit homework source code and check your grades on Coursys
  • Programming setup homework: HW0 due on May 29, 2023 (2%)
  • Four programming homeworks. Due dates: HW1 on June 5, 2023, HW2 on June 26, 2023, HW3 on July 19, 2023, HW4 on July 31, 2023 (12% each)
  • Participation: Helping other students on the discussion board in a positive way (5%)
  • In class presentation (5%)
  • Final Project Proposal: Due on July 24, 2023 (10%)
  • Final Project: Due on Aug 10, 2023 (30%)
  • Final Project Poster Session:
    • Time: Aug 10, 2023
    • Location: ASB Atrium (tentative)