I.E. 1062/2062: DATA MINING
(Fall 2011-2012)


INSTRUCTOR:

Dr. Jayant Rajgopal
1039 Benedum Hall
Telephone No. 624-9840
e-mail: rajgopal@pitt.edu
URL for this web page: http://www.pitt.edu/~jrclass/datamining/

LECTURES:

Room 1021 Benedum Hall

Tue./Thu.: 9:30 AM - 10:45 AM

TEXT:

Introduction to Data Mining, by Pang-Ning Tan, Michael Steinbach and Vipin Kumar, Addison-Wesley, Boston (2006).

NOTES:

Class notes are required. Please download and print them, and bring them to class with you: Notes (zipped folder - password required)

REFERENCES:

The following references will all be on reserve in the Engineering Library:
  • Data Mining: Practical Machine Learning Tools and Techniques (2nd Ed.), by Ian H. Witten and Eibe Frank, Morgan Kaufmann Publishers, San Francisco (2005).
  • Data Mining: Concepts and Techniques, by Jiawei Han and Micheline Kamber, Morgan Kaufmann Publishers, San Francisco (2006).
  • Data Mining Techniques for Marketing, Sales and Customer Support, by Michael J. A. Berry and Gordon Linoff, John Wiley & Sons, New York (2004).

SOFTWARE:

    Overview of IBM SPSS Modeler 14.2 (previously known as Clementine)

    CONTENT:

    This is an introductory course in Data Mining. The objective is to introduce the student to the area of data mining and to give an overview of the important techniques associated with the subject. The tentative list of topics includes data mining applications, an overview of data warehousing, inputs and data preparation, knowledge representation, evaluation of learning, and techniques for classification, association and clustering. Specific algorithms include decision trees; Bayesian learning; covering algorithms for classification rules; instance-based learning; backpropagation and artificial neural networks; the Apriori algorithm; FP-growth; the k-means method; and agglomerative, divisive and probability-based clustering. Please refer to the link alongside for a tentative list of topics.

    GRADING:

    Grades will be based on two open-book examinations, a term paper, homework and class discussions.




    HOMEWORK & ANNOUNCEMENTS

    DATA FILES
    cancer.csv
    zoo.csv
    zoo1.csv
    zoo2.csv
    datamine.csv
    hwdata.csv
    datamine-symbolic.csv
    usage.csv