The goal of our presentation is to describe typical problems that occur in building large medical Bayesian network models and to illustrate some practical techniques for overcoming them. We have collaborated over the last couple of years on building systems for the diagnosis of liver disorders, processing of liver pathology data, and various epidemiological models. In our conference presentation, we will focus on typical problems encountered in model building from the point of view of both knowledge engineers (Druzdzel and Onisko) and medical experts (Schwartz, Dowling and Wasyluk). We will illustrate the knowledge engineering process with examples from our networks.
There are three important aspects of model building: building the graphical structure of the model, obtaining the numerical parameters for this structure, and verification. We believe that these three are closely related to each other and should be the focus of efforts from the very start of the process, which in turn should be iterative rather than one-shot.
The graphical structure of a Bayesian network, often downplayed in the literature, models important and robust structural properties of the domain: direct interactions among the domain variables and, indirectly, conditional independencies among them. The structure is an important focus of the interaction with experts during all stages of model building, and it is good practice to make it follow the causal structure of the domain. This provides a common denominator among various experts and users of the model. The graphical structure, and in particular its connectivity, has a direct impact on the number of numerical parameters required to fully quantify the model. It is also the single most important factor in the accuracy of the ultimate model.
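The effect of connectivity on the number of parameters can be illustrated with a minimal sketch. It assumes discrete variables and full conditional probability tables, so that a node with s states and a given set of parents needs (s - 1) independent probabilities for each combination of parent states. The variable names (Disease, FindingA, and so on) and both structures are hypothetical and are not taken from our networks.

```python
from math import prod


def count_parameters(cardinality, parents):
    """Count the independent probabilities needed to quantify a network.

    cardinality: dict mapping each node to its number of states.
    parents: dict mapping a node to the list of its parents
             (nodes missing from the dict have no parents).
    """
    total = 0
    for node, states in cardinality.items():
        # Number of distinct combinations of parent states (1 if no parents).
        parent_configs = prod(cardinality[p] for p in parents.get(node, []))
        # Each conditional distribution has (states - 1) free parameters.
        total += (states - 1) * parent_configs
    return total


# Hypothetical binary variables, for illustration only.
cardinality = {"Disease": 2, "FindingA": 2, "FindingB": 2, "FindingC": 2}

# Sparse structure: each finding depends only on the disease.
sparse = {
    "FindingA": ["Disease"],
    "FindingB": ["Disease"],
    "FindingC": ["Disease"],
}

# Dense structure: each finding also depends on all earlier findings.
dense = {
    "FindingA": ["Disease"],
    "FindingB": ["Disease", "FindingA"],
    "FindingC": ["Disease", "FindingA", "FindingB"],
}

sparse_count = count_parameters(cardinality, sparse)
dense_count = count_parameters(cardinality, dense)
print(sparse_count, dense_count)
```

Under these assumptions the sparse structure requires 7 independent probabilities and the dense one 15, which equals the size of the full joint distribution over four binary variables minus one. The gap widens rapidly as variables gain more states or more parents.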
Through our work, we have learned to appreciate the importance of a reliable modeling tool that allows for easy construction, presentation, and modification of graphs, allows for documenting the model while building it, and supports hierarchical model structure that hides unnecessary detail in large models. While we have not encountered an ideal tool, we found GeNIe, developed at the Decision Systems Laboratory, University of Pittsburgh (described elsewhere in this volume and presented at the conference), suitable for developing medical applications.
The space allocated for this abstract does not allow for in-depth coverage of our presentation. We will make a full-length paper covering the contents of our presentation available to interested readers at http://www.pitt.edu/~druzdzel/publ.html.
Acknowledgments
Our research was supported by the Air Force Office of Scientific
Research, grant F49620-97-1-0225, by the National Science Foundation
under the Faculty Early Career Development (CAREER) Program, grant
IRI-9624629, by the Polish Committee for Scientific Research, grant
8T11E00811, by the Medical Centre of Postgraduate Education of Poland,
grant 501-2-1-02-14/99, and by the Institute of Biocybernetics and
Biomedical Engineering, Polish Academy of Sciences, grant 16/ST/99.
The paper is available in PostScript (61KB) and PDF (43KB) formats.