Learning Bayesian network parameters from small data sets:
Application of Noisy-OR gates
Authors:
Agnieszka Onisko
Bialystok University of Technology
Institute of Computer Science
Bialystok, 15-351, Poland
e-mail: aonisko@ii.pb.bialystok.pl
-
Marek J. Druzdzel
Decision Systems Laboratory
School of Information Sciences
and
Intelligent Systems Program
University of Pittsburgh
e-mail: marek@sis.pitt.edu
-
Hanna Wasyluk
The Medical Center of Postgraduate Education
Warsaw, Marymoncka 99, Poland
e-mail: hwasyluk@cmkp.edu.pl
Abstract:
Existing data sets of cases can significantly reduce the knowledge
engineering effort required to parameterize Bayesian networks.
Unfortunately, when a data set is small, many conditioning cases
are represented by too few data records or none at all, and so
offer an insufficient basis for learning conditional probability
distributions.
We propose a method that uses Noisy-OR gates to reduce the data
requirements in learning conditional probabilities.
We test our method on HEPAR II, a model for diagnosis of liver
disorders, whose parameters are extracted from a real, small set
of patient records.
The diagnostic accuracy of the multiple-disorder model enhanced with
the Noisy-OR parameters was 6.7% better than that of the plain
multiple-disorder model and 14.3% better than that of a
single-disorder diagnosis model.
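To illustrate why a Noisy-OR gate lowers the data requirements, the sketch below builds a full conditional probability table for a binary effect from one parameter per cause plus a leak term, rather than one parameter per combination of causes. It is a minimal illustration of the general leaky Noisy-OR formula, not the procedure used in the paper. The function name `noisy_or_cpt`, the three link probabilities, and the leak value are hypothetical, and the link probabilities are assumed to exclude the leak.

```python
from itertools import product


def noisy_or_cpt(link_probs, leak=0.0):
    """Return P(effect present | cause states) for every combination of causes.

    link_probs[i] is the assumed probability that cause i alone produces the
    effect; leak is the assumed probability that the effect occurs with no
    modeled cause present.
    """
    cpt = {}
    for states in product([False, True], repeat=len(link_probs)):
        # The effect is absent only if the leak and every active cause
        # all independently fail to produce it.
        p_absent = 1.0 - leak
        for active, p in zip(states, link_probs):
            if active:
                p_absent *= 1.0 - p
        cpt[states] = 1.0 - p_absent
    return cpt


# Hypothetical numbers: three causes, so 3 link parameters + 1 leak
# determine all 2**3 = 8 rows of the table.
link_probs = [0.8, 0.6, 0.3]
cpt = noisy_or_cpt(link_probs, leak=0.05)

for states, p in sorted(cpt.items()):
    print(states, round(p, 4))
```

With n binary causes, a general table needs 2^n conditional distributions, many of which may have few or no matching records in a small data set, while the gate needs only n + 1 numbers, each of which can be estimated from the more numerous records in which a single cause is present.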
The full paper is available in Compressed PostScript (275KB) and PDF (339KB) formats.
marek@sis.pitt.edu / Last update: 9 May 2005