Natural language processing (NLP) is the study of human language from a computational perspective. Over the past 20 years, the field has evolved significantly, driven primarily by advances in statistical machine learning and deep learning. A notable recent breakthrough is the development of “pre-trained” language models, such as ChatGPT, which have substantially enhanced capabilities across a wide range of applications. This is an introductory undergraduate-level course on natural language processing. We will cover core techniques and modern advances in NLP, especially in the era of large language models. Through lectures and assignments, students will gain the skills and experience needed to understand, design, implement, and test large language models. We may also host invited speakers.
There is no required textbook for this course. Speech and Language Processing (3rd ed. draft) by Dan Jurafsky and James H. Martin (2023) and Natural Language Processing by Jacob Eisenstein are recommended if you would like to read more about NLP. Readings from papers, blogs, tutorials, and book chapters will be posted on the course website. Textbook readings are assigned to complement the material discussed in lecture. You may find it useful to do these readings before lecture as preparation or after lecture for review, but you are not expected to know material in the textbook that isn’t covered in lecture. Paper readings are intended to supplement the course material if you are interested in diving deeper into particular topics.
Date | Topic | Material | Event | Due |
---|---|---|---|---|
Week 1 Jan 21 | Introduction [slides] | Readings | | |
Jan 24 | Language modeling (n-gram language models) 1 [slides] | Readings | | |
Week 2 Jan 28 | | Other resources | | |
Jan 31 | No class | | | |
Week 3 Feb 4 | No class | | | |
Feb 7 | Language modeling (n-gram language models) 2 [slides] | Readings | A1 out | |
Week 4 Feb 11 | Text classification [slides] | Readings, Others | | |
Feb 14 | Word embeddings 1 [slides] | Readings | | |
Week 5 Feb 18 | Word embeddings 2 [slides] | Readings | | |
Feb 21 | Neural language models: Overview [slides] | Readings | | |
Week 6 Feb 25 | Neural language models: Tokenization | Readings | | |
Feb 28 | | | | |
Week 7 Mar 4 | | | A2 out | A1 due |
Mar 7 | | | | |
Week 8 Mar 11 | No class | | | |
Mar 14 | No class | | | |
Week 9 Mar 18 | | | | |
Mar 21 | | | | |
Week 10 Mar 25 | | | | |
Mar 28 | | | | |
Week 11 Apr 1 | | | | |
Apr 4 | No class | | A3 out | A2 due |
Week 12 Apr 8 | | | | |
Apr 11 | | | | |
Week 13 Apr 15 | | | | |
Apr 18 | No class | | | |
Week 14 Apr 22 | | | | |
Apr 25 | | | | |
Week 15 Apr 29 | | | | |
May 2 | | | | |
Week 16 May 6 | No class | | | A3 due |