Took the course in the past? Click here for 2017, for 2016, 2014, and 2013. (E-mail Na-Rae for password.)

LING 1330/2330 Computational Linguistics

Fall 2018, University of Pittsburgh

Meetings: Tue & Thu 4:30pm - 5:45pm   Classroom: 363 Cathedral of Learning

Description

This is a course designed to introduce students who have been exposed to linguistics to real-world applications of computational linguistics. The students will first learn the fundamentals of how computers are used to represent and process textual and spoken information. They will then be introduced to the challenges of real-world language engineering problems and learn how they are handled with the latest language technologies. The topics include: spell-checking, machine translation, part-of-speech tagging, parsing, document classification, and corpus building and exploration. Students will be given hands-on training on the basics of text processing using Python and will have a chance to work with NLTK, a popular natural language processing application suite. This course is designed specifically for students in the humanities; computer science majors (who are not linguists) are encouraged to take CS 1671 or CS 1571 instead.

Prerequisites

LING 1000 Introduction to Linguistics is the only prerequisite for this course. Prior knowledge of Python or another programming language is not required but is highly recommended. CS 0008 "Introduction to Computer Programming with Python" or CS 0155 "Data Witchcraft" will give you good preparation.

Notes for Incoming Students

Do you have little/no programming experience? Please read the following.
  1. I strongly recommend that you get a head start on Python this summer. The class moves very fast, and you will be a lot more comfortable knowing a bit going in. The Python 3 Notes I put together have everything you need.
  2. Students with no prior programming experience do complete this course with great success, but they typically put in more hours than their experienced peers (which is understandable). For this reason, I encourage you to sign up for an additional 1-credit recitation. It will be run as LING 1901 Independent Study with the instructor, on an S/NC basis. Three recitation sessions are planned, all on Friday: (1) 10am -- 11:15am, (2) 11:30am -- 12:45pm, (3) 1pm -- 2:15pm. Please leave one of these slots open in your schedule.

Students are required to bring their own laptop to class. It should be running one of the following operating systems: Windows 10 (7 & 8 are also fine), Mac OS X, or Linux (any distribution). Mobile or cloud-based machines such as Android/Apple tablets or Chromebooks are not suitable.

Instructors

Who | Pitt email | Office hours | Location
Na-Rae Han | naraehan | Mon 11am-1pm & Wed 11am-noon | G17 CL
Katherine Kairis (TA) | kak275 | Mon & Wed 3-5pm | 2832 CL (linguistics grad/undergrad office)
Daniel Zheng (TA) | daniel.zheng | Mon & Wed 7-9pm | 2832 CL (linguistics grad/undergrad office)
Note: We are also available to meet by appointment.

Textbooks

[1] Language and Computers. Markus Dickinson et al. Wiley-Blackwell. 2012.
[2] Python tutorial: Python 3 Notes
[3] Natural Language Processing with Python. (updated edition based on Python 3 and NLTK 3) Steven Bird et al. O'Reilly Media.

Course Organization

Each meeting will comprise two parts: lecture and lab. In the first half of the class, topics presented in textbook [1], Language and Computers, will be covered in a lecture-and-discussion format. In the second half, students will get hands-on training in the basics of text processing using Python and the Natural Language Toolkit (NLTK). The optional Friday recitations will focus on the programming aspect, offering additional Python exercises, reviews of upcoming homework, and individual help.
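
To give a flavor of what "the basics of text processing" can look like, here is a minimal sketch in plain Python. It uses only the standard library rather than NLTK, and the function names and sample sentence are illustrative assumptions, not taken from the course materials.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (runs of letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Count how many times each word token occurs in the text."""
    return Counter(tokenize(text))


sample = "The cat sat on the mat. The mat was flat."
freqs = word_frequencies(sample)

# Show the two most frequent word tokens with their counts.
print(freqs.most_common(2))  # prints [('the', 3), ('mat', 2)]
```

This simple regular expression drops punctuation and digits entirely; a real tokenizer, such as the ones NLTK provides, handles many more cases.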

Assignment Schedule


  1. As a rule, there will always be some form of assignment between classes. There are two types: homework assignments and programming exercises. They are administered via CourseWeb and are due before the beginning of the next class.
  2. Homework Assignments (40-60 points): These are assigned on most Thursdays. There will be around 11. These will comprise questions on lecture topics as well as programming problems.
  3. Programming Exercises (20 points): They are designed to help you learn and practice the programming aspect of the course. As long as you are keeping up with the course contents, you should be able to complete them in 1-2 hours, possibly more if you are new to programming. These will be given when there is no homework assignment, mostly on Tuesdays.
  4. Readings and Previews: In addition to the homework/exercise assignment, you will have book chapters and Python tutorials for the upcoming class to study beforehand.
  5. I will make every effort to post new assignments one week in advance. However, I might need to make some adjustments depending on the progress we make in the classroom. Therefore, any assignment other than the next one due should be considered a DRAFT until it is finalized, which will happen 30 minutes after class.
  6. The detailed assignment schedule is on the Class Schedule page.

Exams, Requirements, Grading and Policies

Please read the Course Policies page.