COMP3361: Natural Language Processing

Course Information

Instructor

Lecture

TA

Course Description

Natural language processing (NLP) is the study of human language from a computational perspective. Over the past 20 years, the field has evolved significantly, driven primarily by advances in statistical machine learning and deep learning. A notable recent breakthrough is the development of "pre-trained" language models, which power systems such as ChatGPT and have substantially enhanced capabilities across a wide range of applications. This course is an introductory undergraduate-level course on natural language processing. We will cover core NLP techniques and modern advances, especially in the era of large language models. Through lectures and assignments, students will gain the skills and experience needed to understand, design, implement, and test large language models. We may also host invited speakers for talks.

Prerequisites

Course Materials

There is no required textbook for this course. If you would like to read more about NLP, we recommend Speech and Language Processing (3rd ed. draft) by Dan Jurafsky and James H. Martin (2023) and Natural Language Processing by Jacob Eisenstein. Readings from papers, blogs, tutorials, and book chapters will be posted on the course website. Textbook readings are assigned to complement the material discussed in lecture. You may find it useful to do these readings before lecture as preparation or after lecture as review, but you are not expected to know everything in the textbook if it isn't covered in lecture. Paper readings are intended to supplement the course material if you are interested in diving deeper into particular topics.

Grading

Course Schedule

| Week | Date   | Topic                                        | Material           | Event  | Due    |
|------|--------|----------------------------------------------|--------------------|--------|--------|
| 1    | Jan 21 | Introduction                                 | [slides], Readings |        |        |
| 1    | Jan 24 | Language modeling (n-gram language models) 1 | [slides], Readings |        |        |
| 2    | Jan 28 |                                              | Other resources    |        |        |
| 2    | Jan 31 | No class                                     |                    |        |        |
| 3    | Feb 4  | No class                                     |                    |        |        |
| 3    | Feb 7  | Language modeling (n-gram language models) 2 | [slides], Readings | A1 out |        |
| 4    | Feb 11 | Text classification                          | [slides], Readings, Others |  |       |
| 4    | Feb 14 | Word embeddings 1                            | [slides], Readings |        |        |
| 5    | Feb 18 | Word embeddings 2                            | [slides], Readings |        |        |
| 5    | Feb 21 | Neural language models: Overview             | [slides], Readings |        |        |
| 6    | Feb 25 | Neural language models: Tokenization         | Readings           |        |        |
| 6    | Feb 28 |                                              |                    |        |        |
| 7    | Mar 4  |                                              |                    | A2 out | A1 due |
| 7    | Mar 7  |                                              |                    |        |        |
| 8    | Mar 11 | No class                                     |                    |        |        |
| 8    | Mar 14 | No class                                     |                    |        |        |
| 9    | Mar 18 |                                              |                    |        |        |
| 9    | Mar 21 |                                              |                    |        |        |
| 10   | Mar 25 |                                              |                    |        |        |
| 10   | Mar 28 |                                              |                    |        |        |
| 11   | Apr 1  |                                              |                    |        |        |
| 11   | Apr 4  | No class                                     |                    | A3 out | A2 due |
| 12   | Apr 8  |                                              |                    |        |        |
| 12   | Apr 11 |                                              |                    |        |        |
| 13   | Apr 15 |                                              |                    |        |        |
| 13   | Apr 18 | No class                                     |                    |        |        |
| 14   | Apr 22 |                                              |                    |        |        |
| 14   | Apr 25 |                                              |                    |        |        |
| 15   | Apr 29 |                                              |                    |        |        |
| 15   | May 2  |                                              |                    |        |        |
| 16   | May 6  | No class                                     |                    |        | A3 due |