COMP3361: Natural Language Processing

Course Information

Instructor

Lecture

TA

Course Description

Natural language processing (NLP) is the study of human language from a computational perspective. Over the past 20 years, the field has evolved significantly, driven primarily by advances in statistical machine learning and deep learning. A notable recent breakthrough is the development of "pre-trained" language models, such as ChatGPT, which have substantially enhanced capabilities across a wide range of applications. This is an introductory undergraduate-level course on natural language processing. We will cover core NLP techniques and modern advances, especially in the era of large language models. Through lectures and assignments, students will gain the skills and experience needed to understand, design, implement, and test large language models. We may also host invited speakers.

Prerequisites

Course Materials

There is no required textbook for this course. Speech and Language Processing (3rd ed. draft) by Dan Jurafsky and James H. Martin (2023) and Natural Language Processing by Jacob Eisenstein are recommended if you would like to read more about NLP. Readings from papers, blogs, tutorials, and book chapters will be posted on the course website. Textbook readings are assigned to complement the material discussed in lecture. You may find it useful to do these readings before lecture as preparation or after lecture as review, but you are not expected to know everything in the textbook unless it is covered in lecture. Paper readings are intended to supplement the course material if you are interested in diving deeper into particular topics.

Grading

Course Schedule

| Date | Topic | Material | Event | Due |
| --- | --- | --- | --- | --- |
| Week 1 | | | | |
| Jan 16 | Introduction | [slides], Readings | | |
| Jan 19 | Language modeling (n-gram language models) | [slides], Readings | | |
| Week 2 | | | | |
| Jan 23 | Text classification | [slides], Readings, Others | A1 out | |
| Jan 26 | Word embeddings 1 | [slides], Readings | | |
| Week 3 | | | | |
| Jan 30 | Word embeddings 2 | [slides], Readings | | |
| Feb 2 | Neural language models: Overview, tokenization | [slides], Readings, Others | | |
| Week 4 | | | | |
| Feb 6 | Neural language models: RNNs | [slides], Readings | | |
| Feb 9 | | Other resources | | |
| Week 5 | | | | |
| Feb 13 | No class | | | |
| Feb 16 | No class | | | |
| Week 6 | | | | |
| Feb 20 | Neural language models: RNNs and LSTMs | [slides], Readings | A2 out | A1 due |
| Feb 23 | Neural language models: Transformers | [slides], Readings | | |
| Week 7 | | | | |
| Feb 27 | Neural language models: Pretraining 1 | [slides], Readings | | |
| Mar 1 | Neural language models: Pretraining 2 | [slides], Readings, Others | | |
| Week 8 | | | | |
| Mar 5 | No class | | | |
| Mar 8 | No class | | | |
| Week 9 | | | | |
| Mar 12 | LLM prompting, in-context learning, scaling laws, emergent capabilities 1 | [slides], Readings, Others | | |
| Mar 15 | LLM prompting, in-context learning, scaling laws, emergent capabilities 2 | [slides], Readings, Others | | |
| Week 10 | | | | |
| Mar 19 | Coding tutorial | [slides], Readings | | |
| Mar 22 | Natural language generation with LLMs 1 | [slides], Others | | A2 due |
| Week 11 | | | | |
| Mar 26 | Natural language generation with LLMs 2 | [slides] | A3 (originally the course project) out | |
| Mar 29 | No class | | | |
| Week 12 | | | | |
| Apr 2 | Intro to advanced topics | [slides] | | |
| Apr 5 | Code language models (by Ansong Ni, Yale) [slides]; Retrieval-augmented LMs (by Weijia Shi, UW) [slides] | Readings, Others | | |
| Week 13 | | | | |
| Apr 9 | LLMs/VLMs as agents | [slides], Readings | | |
| Apr 12 | Solving Real-World Tasks with AI Agents (by Shuyan Zhou, CMU) [slides]; Instruction tuning for LLMs (by Yizhong Wang, UW) [slides] | Readings, Others | | |
| Week 14 | | | | |
| Apr 16 | Efficient LM methods (by Bailin Wang, MIT) | [slides], Readings | | |
| Apr 19 | Principles of Reasoning: Designing Compositional and Collaborative Generative AIs (by William Wang, UCSB; held Apr 18); LLM alignment (by Ruiqi Zhong, UC Berkeley) [slides] | Readings, Others | | |
| Week 15 | | | | |
| Apr 23 | Multimodal language models/VLMs (by Yushi Hu, UW) | [slides], Readings | | |
| Apr 26 | Robotics in the era of LLMs/VLMs (by Ted Xiao, Google DeepMind) | [slides], Readings, Others | | |
| Week 16 | | | | |
| Apr 30 | No class (revision) | | | Course project due |