Learning Bayesian network parameters from small data sets:
Application of Noisy-OR gates
Authors:

Agnieszka Onisko
Bialystok University of Technology
Institute of Computer Science
Bialystok, 15-351, Poland
e-mail: aonisko@ii.pb.bialystok.pl
FAX: (085) 422-393

Marek J. Druzdzel
Decision Systems Laboratory
School of Information Sciences
and Intelligent Systems Program
University of Pittsburgh
e-mail: marek@sis.pitt.edu

Hanna Wasyluk
The Medical Center of Postgraduate Education
and Institute of Biocybernetics and Biomedical Engineering,
Polish Academy of Sciences
Warsaw, Marymoncka 99, Poland
e-mail: hwasyluk@cmkp.edu.pl

Abstract:

Existing data sets of cases can significantly reduce the knowledge
engineering effort required to parameterize Bayesian networks.
Unfortunately, when a data set is small, many conditioning cases
are represented by too few data records, or by none at all, and so
offer an insufficient basis for learning conditional probability
distributions.
We propose a method that uses Noisy-OR gates to reduce the data
requirements in learning conditional probabilities.
We test our method on Hepar II, a model for diagnosis of liver
disorders, whose parameters are extracted from a real, small set
of patient records. Diagnostic accuracy of the multiple-disorder
model enhanced with the Noisy-OR parameters was around 6% better
than the accuracy of the plain multiple-disorder model and 10%
better than the single-disorder diagnosis model.
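
To illustrate why Noisy-OR gates reduce data requirements, the following is a minimal sketch, not the authors' implementation. It assumes binary causes and a binary effect, and it treats the leak as an independent, always-present cause. The function name `noisy_or_cpt`, the parameters `link_probs` and `leak`, and all numbers are hypothetical and are not taken from Hepar II. The point is that one probability per parent (plus a leak) expands into the full conditional probability table, so the 2^n conditioning cases need not each be estimated from data.

```python
from itertools import product


def noisy_or_cpt(link_probs, leak=0.0):
    """Expand Noisy-OR parameters into a full conditional probability table.

    link_probs[i] is the assumed probability that parent i, when present,
    produces the effect on its own; leak is the assumed probability that
    the effect appears when no modeled parent is present.
    Returns a dict mapping each tuple of parent states (0 = absent,
    1 = present) to P(effect present | parent states).
    """
    cpt = {}
    for states in product([0, 1], repeat=len(link_probs)):
        # The effect is absent only if the leak and every present
        # parent all independently fail to produce it.
        p_absent = 1.0 - leak
        for present, p in zip(states, link_probs):
            if present:
                p_absent *= 1.0 - p
        cpt[states] = 1.0 - p_absent
    return cpt


# Three hypothetical parents: 3 link probabilities + 1 leak = 4 parameters,
# expanded into all 2**3 = 8 conditioning cases.
cpt = noisy_or_cpt([0.8, 0.6, 0.3], leak=0.05)
for states, p in sorted(cpt.items()):
    print(states, round(p, 4))
```

In this sketch the number of parameters grows linearly with the number of parents, while the table they generate grows exponentially, which is why conditioning cases with few or no records can still be assigned a probability.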

The full paper is available in Compressed PostScript (125KB) and PDF (160KB) formats.
marek@sis.pitt.edu /
Last update: 11 May 2005