Fall 2020: CS 3750 - Advanced Topics in Machine Learning
Integrating Scientific Theory with Machine Learning

Time
Tuesday, Thursday
9:25AM - 10:40AM
Zoom Link: https://pitt.zoom.us/j/91677876958
Office hours: Thursday 8:20AM - 9:20AM, or by appointment


Instructor
Xiaowei Jia
xiaowei (at) pitt.edu
www.pitt.edu/~xiaowei
5413 Sennott Square


Summary: Machine learning (ML) methods have found immense success in extracting complex knowledge from "Internet-scale" data. Given their accomplishments in commercial applications, there is great interest in whether ML methods can accelerate knowledge discovery in scientific domains that have traditionally progressed via physics-based scientific models. This course will systematically investigate the limitations of traditional ML methods when applied to real-world scientific problems, which involve incomplete or imperfect data with non-stationary and chaotic behavior. To overcome these limitations, the course will also present multiple ways to integrate physical knowledge with ML models. The methods will be illustrated using examples from diverse scientific domains such as climate science, hydrology, fluid dynamics, and aerospace. The course will provide hands-on experience through course projects and research presentations.
Recommended: Basic knowledge of machine learning; basic proficiency in Python and TensorFlow/PyTorch.

Grading policy: 50% paper presentations (25% each) + 50% course project

Schedule
  1. 08/20-08/27: Course overview and preliminaries
    • 08/20, 08/25: ML review: classification, overfitting, neural networks (slides, video 1, video 2)
    • 08/27: ML applications in scientific domains (slides, video)
  2. 09/01-09/08: Knowledge-guided model architecture
  3. 09/10-09/15: Design of ML loss functions
  4. 09/17: Downscaling
    Reading 1: Deep learning methods for super-resolution reconstruction of turbulent flows
    Reading 2: DeepSD: Generating High Resolution Climate Change Projections through Single Image Super-Resolution
  5. 09/22-09/24: Model initialization and transfer learning
  6. 09/29: Guest lecture: Process-Guided Meta Transfer Learning for Predicting Temperature of Unmonitored Lake Systems (Jared Willard) (video)
    Dataset: River basin, Lake modeling, Traffic prediction, Slides
  7. 10/01:
    Reading 1: Predicting AC Optimal Power Flows: Combining Deep Learning and Lagrangian Dual Methods
    Reading 2: Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping
  8. 10/06-10/08: Other modeling approaches
  9. 10/13-10/15: Uncertainty quantification
  10. 10/20-10/22: Generative ML models
  11. 10/27-10/29: Inverse modeling
  12. 11/03: Discovering and solving underlying PDEs
    Reading 1: Solving high-dimensional partial differential equations using deep learning
    Reading 2: Data-driven discovery of partial differential equations
  13. 11/05: Guest lecture: ML in healthcare
  14. 11/10-11/19: Project presentations

Other resources
  1. Datasets: TBA.
  2. Reference:


Academic integrity
All assignment submissions must be the sole work of each individual student. Students may not read or copy another student's solutions or share their own solutions with other students. Students may not review solutions from students who have taken the course in previous years. Submissions that are substantively similar will be considered cheating by all students involved, and as such, students must be mindful not to post their code publicly. The use of books and online resources is allowed, but must be credited in submissions, and material may not be copied verbatim. Any use of electronics or other resources during an examination will be considered cheating.

If you have any doubts about whether a particular action may be construed as cheating, ask the instructor for clarification before you do it. The instructor will make the final determination of what is considered cheating.

Cheating in this course will result in a grade of F for the course and may be subject to further disciplinary action.

Using an open-source codebase is acceptable, but you must explicitly cite the source and follow the owner's citation guidelines if any exist. For any writing involved in the project, plagiarism is strictly prohibited. If you are unsure whether your work would be considered plagiarism, ask the instructor before submitting or presenting the work.

University statement
If you have a disability for which you are or may be requesting an accommodation, you are encouraged to contact both your instructor and the Office of Disability Resources and Services (DRS), 140 William Pitt Union, at 412-648-7890 or 412-383-7355 (TTY), or to visit the Office of Disability Resources website, as early as possible, but no later than the fourth week of the term. DRS will verify your disability and determine reasonable accommodations for this course.