CPSC 65/LING 20: Fall 2020



This course will introduce you to a broad range of topics in natural language processing, including language modeling, part-of-speech tagging, syntactic parsing, vector semantics, text classification, machine translation, and applications to cognitive modeling and psycholinguistics.

Course Goals

By the end of the course you will:

Class Information

Professor: Spencer Caplan
Office: Science Center 258
Phone: (610) 328-8272
Office Hours: T 3:30 -- 4:30pm; T 6:30 -- 7:30pm; W 9:00 -- 10:00am; or by appointment
OH location: Zoom

Lecture time: Tuesday and Thursday 2:00 -- 3:15pm
Lecture location: Zoom

Lab time: Wednesday 1:15 -- 2:45pm and 3:00 -- 4:30pm
Lab location: Start on Zoom and move to Slack

Class discussion board: Piazza

Note to enrolled students: please don't hesitate to contact Prof. Caplan if you are having trouble accessing any of the course resources (Slack, Piazza, etc.)


You should not purchase any textbooks this semester. Assigned readings will be posted to the course website and come primarily from:

Tentative schedule


Sep 08



  • History of NLP
  • Unix tools
  • regular expressions
  • tokenization and normalization

Required Reading

Lecture Slides


Lab 1

Sep 10

Add/Drop ends (Sep 14)


Sep 15

Lab 1 (Counts) due (Sep 16)

Language Modeling

  • probability
  • n-grams
  • smoothing
  • train-dev-test

Required Reading

Lecture Slides


Lab 2
(Lang Mod)

Sep 17


Sep 22


Noisy Channel

  • edit distance
  • spelling correction
  • phonology and speech processing
  • cognitive modeling

Required Reading

Lecture Slides


Sep 24


Sep 29

Lab 2 (Lang Mod) due (Sep 30)

Vector Semantics

  • lexical semantics
  • co-occurrence matrices
  • vector comparison
  • re-weighting embeddings
  • sparse and dense embeddings (word2vec)

Required Reading

Lecture Slides


Lab 3
(Word Vectors)

Oct 01


Oct 06

Lab 3 (Word Vectors) due (Oct 07)

POS Tagging

  • word classes and POS
  • HMMs
  • Viterbi

Required Reading

Lecture Slides

Lab 4

Oct 08


Oct 13



  • naive Bayes
  • logistic regression
  • perceptrons and SVMs
  • neural networks

Required Reading


Lecture Slides

Oct 15


Oct 20

Lab 4 (POS) due (Oct 21)

Lab 5
(Spam Filter)

Oct 22



  • constituency structure
  • CFGs and treebanks
  • top-down, bottom-up and CKY parsing
  • PCFGs and statistical parsing

Required Reading

Lecture Slides


Oct 27

Lab 5 (Spam Filter) due (Oct 28)

Lab 6

Oct 29


Nov 03

Election Day (no class, go vote)

Nov 05

Lab 6 (Parsing) due

CR/NC/W Deadline (Nov 06)


  • lexicalized parsing and reranking

Lecture Slides

Final Projects


Nov 10


Machine Translation

  • history
  • word-to-word and noisy-channel approaches
  • evaluation: BLEU

Required Reading


Lecture Slides

Nov 12


Guest Lecture:
Stephen Mayhew (Duolingo)

  • NLP at Duolingo

Lecture Slides


Nov 17


Machine Translation

  • EM-algorithm
  • phrasal-MT
  • decoding

Lecture Slides

Nov 19

Last "on campus" day (Nov 20)

Guest Lecture:
Jordan Kodner (Stony Brook)

  • perspectives on Deep Learning in NLP

Lecture Slides

Optional Reading


Nov 24


Nov 26


Dec 01



  • finite-state machines
  • vowel harmony
  • tolerance principle
  • wrap up!

Lecture Slides

Dec 03

Last day of lectures (Dec 04)


Dec 08

Final Presentations

Dec 10

Dec 15

Absolute last day to hand in final projects


Your overall grade in the course will be determined as follows:

70% Labs and projects
25% Final project / paper
5% Participation and Attendance

Labs and Projects

This course features regular lab projects that account for the largest component of your course grade. Lab attendance is expected of all students, unless you have already completed and submitted the lab assignment for the week.

Lab assignments will typically be assigned during the lab sections on Wednesday and will generally be due by noon on Wednesday of the following week (or two weeks for some labs). Many of the labs involve substantial programming and development, so you are strongly encouraged to start early. Do not underestimate how long it may take you to complete the lab projects!

Even if you do not fully complete an assignment, you should submit what you have done to receive partial credit.

Programming language

Assignments will presuppose knowledge of python3. You may end up learning some bash scripting and other tools along the way, but you are not expected to know this ahead of time.

Please make sure that each program you turn in has:

I expect that you will be using python3 for all assignments. If you would like to use something different, you need to ask me about it prior to starting the lab.

Remote Collaboration

Since CS65 is operating remotely and most students will not have access to the department computer labs, you will need to coordinate some setup for remote collaboration between you and your lab partner or team members. I am not requiring any particular technology for remote collaboration, so feel free to use whatever configuration works best for everyone involved, although I link to some potential resources below.

Pair Programming

Students will work together with a pair-programming partner on all lab assignments. You may choose your own programming partner, but you are strongly encouraged to form a partnership with someone in the same or a nearby timezone. If you opt out of selecting your own partner, Prof. Caplan will assign one to you with a script that pairs students at random. Partnerships are a per-assignment commitment, so you can switch partners between labs if you choose to do so. Grades for each project are assigned to the partnership as a whole (i.e., each student gets the same grade for work submitted together).

When working in a programming partnership, you should follow these guidelines:

International Student Lab Accommodations

While the general expectation is that students participate synchronously in lectures and labs, I acknowledge that some students are in significantly different timezones from Swarthmore. If your lab section time occurs between 11pm and 8am in your local timezone, you may alternatively attend an additional block of office hours (Wednesdays 9:00 -- 10:00am) designed to accommodate such timezone issues. If you believe this accommodation applies to you, you need to contact Prof. Caplan.


Assignment Submissions

You must submit your assignments electronically by pushing to your assigned git repository. You may push your assignment multiple times, and a history of previous submissions will be saved. You are encouraged to push your work regularly.

Extension and Late-Day Policy

To help with cases of minor illnesses, network issues, or other short-term time limitations, all students start the course with three “late assignment days” to be used at your discretion, with no questions asked. To use your extra time, you must email Prof. Caplan after you have completed the lab and pushed to your repository. You do not need to inform anyone ahead of time. When you use late time, you should still expect to work on the newly released lab during the following lab section meeting. The professor will always prioritize answering questions related to the current lab assignment. Please be careful in how you use your late days! Once you use them all up, further extensions cannot be granted, even for issues that are not your fault.

Your late days will be counted at the granularity of full days and will be tracked on a per-student (NOT per-partnership) basis. That is, if you turn in an assignment five minutes after the deadline, it counts as using one day. Using a late day counts towards the late days of each student in the programming partnership. In the case in which only one partner has unused late days, that partner’s late days may be used, barring a consistent pattern of abuse (as determined by the professor).
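As an illustration of the full-day granularity described above, here is a minimal python3 sketch. It is not an official grading tool: the function name `late_days_used` is hypothetical, and treating each started 24-hour period after the deadline as one late day is an assumption about how "full days" are counted.

```python
from datetime import datetime, timedelta
import math


def late_days_used(deadline: datetime, submitted: datetime) -> int:
    """Return the number of whole late days a submission consumes.

    Assumption: every started 24-hour period after the deadline
    counts as one full late day. An on-time submission uses none.
    """
    if submitted <= deadline:
        return 0
    overdue = submitted - deadline
    # Dividing two timedeltas gives a float number of days; round up.
    return math.ceil(overdue / timedelta(days=1))


# Example: a lab due at noon on Wednesday, Sep 16, 2020.
deadline = datetime(2020, 9, 16, 12, 0)
print(late_days_used(deadline, datetime(2020, 9, 16, 12, 5)))  # 1
print(late_days_used(deadline, datetime(2020, 9, 17, 12, 5)))  # 2
```

Under this reading, a push five minutes after the deadline costs one late day, and that day is deducted from each partner's individual balance.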

If you feel that you need an extension on an assignment or that you are unable to attend class for two or more meetings due to a medical condition (e.g., extended illness, concussion, hospitalization) or other emergency, you must contact the dean’s office and your instructors. Faculty will coordinate with the deans to determine and provide the appropriate accommodations.

Academic Integrity

Academic honesty is required in all your work. Under no circumstances may you hand in work done with (or by) someone else under your own name. Your code should never be shared with anyone; you may not examine or use code belonging to someone else, nor may you let anyone else look at or make a copy of your code. This includes, but is not limited to, obtaining solutions from students who previously took the course or code that can be found online. You may not share solutions after the due date of the assignment or make them publicly available anywhere (e.g., in a public GitHub repository).

Discussing ideas and approaches to problems with others on a general level is fine (in fact, we encourage you to discuss general strategies with each other), but you should never read anyone else’s code or let anyone else read your code. All code you submit must be your own with the following permissible exceptions: code distributed in class, code found in the course textbook, and code worked on with an assigned partner. In these cases, you should always include detailed comments that indicate on which parts of the assignment you received help, and what your sources were.

Failure to abide by these rules constitutes academic dishonesty and will lead to a hearing of the College Judiciary Committee. According to the Faculty Handbook: “Because plagiarism is considered to be so serious a transgression, it is the opinion of the faculty that for the first offense, failure in the course and, as appropriate, suspension for a semester or deprivation of the degree in that year is suitable; for a second offense, the penalty should normally be expulsion.”

The spirit of this policy applies to all course work, including code, homework solutions (e.g., proofs, analysis, written reports), and exams. Please contact me if you have any questions about what is permissible in this course.

Academic Accommodations

If you believe you need accommodations for a disability or a chronic medical condition, please contact Student Disability Services (Parrish 113W, 123W) via e-mail at studentdisabilityservices@swarthmore.edu to arrange an appointment to discuss your needs. As appropriate, the office will issue students with documented disabilities or medical conditions a formal Accommodations Letter. Since accommodations require early planning and are not retroactive, please contact Student Disability Services as soon as possible. For details about the accommodations process, visit the Student Disability Services website.

You are also welcome to contact me privately to discuss your academic needs. However, all disability-related accommodations must be arranged, in advance, through Student Disability Services.


Last updated: Sunday, October 11, 2020