CS 5/662, Winter 2023
This course covers key algorithms and modeling techniques for processing human language sequences, which are needed for applications such as Automatic Speech Recognition and Machine Translation. Both statistical and symbolic approaches to modeling natural language phonology, morphology, and syntax are presented, along with widely used algorithms for efficiently learning and applying different kinds of natural language grammars. There is an emphasis on algorithms and data structures that scale up to handle very large real-world data sets, such as newswire text. The course includes several challenging hands-on programming assignments. Suggested prerequisite: CSE 560 or equivalent. Python programming experience is highly recommended, as is familiarity with regular expressions.
Two of the textbooks we will be using are available electronically from the OHSU Library, and both of the others are available in draft form online.
If possible, I do encourage you to buy a copy of at least the Eisenstein book, so as to support the publishing of high-quality NLP texts.
By the end of the course, students will:
Tuesdays & Thursdays, 15:00 – 16:30, mostly in BICC 124
Starts: January 10
Ends: March 23
Instructor: Steven Bedrick
Office Location: Gaines Hall, 21
Office Hours: Wednesday and Friday 13:00 – 18:00; by appointment otherwise. Note that I will not necessarily be on campus during those times (in which case I will be available via Webex), so before schlepping all the way to Gaines you’ll want to check in with me.
Page last updated: 2023-12-15 10:25