tailieunhanh - Chemistry report: "Research Article: NML Computation Algorithms for Tree-Structured Multinomial Bayesian Networks"

A collection of international scientific research reports in chemistry, provided as reference material for readers interested in the field.

Hindawi Publishing Corporation
EURASIP Journal on Bioinformatics and Systems Biology
Volume 2007, Article ID 90947, 11 pages
doi 2007 90947

Research Article
NML Computation Algorithms for Tree-Structured Multinomial Bayesian Networks

Petri Kontkanen, Hannes Wettig, and Petri Myllymaki

Complex Systems Computation Group (CoSCo), Helsinki Institute for Information Technology (HIIT), Box 68, Department of Computer Science, FIN-00014 University of Helsinki, Finland

Received 1 March 2007; Accepted 30 July 2007

Recommended by Peter Grunwald

Typical problems in bioinformatics involve large discrete datasets. Therefore, in order to apply statistical methods in such domains, it is important to develop efficient algorithms suitable for discrete data. The minimum description length (MDL) principle is a theoretically well-founded, general framework for performing statistical inference. The mathematical formalization of MDL is based on the normalized maximum likelihood (NML) distribution, which has several desirable theoretical properties. In the case of discrete data, straightforward computation of the NML distribution requires exponential time with respect to the sample size, since the definition involves a sum over all the possible data samples of a fixed size. In this paper, we first review some existing algorithms for efficient NML computation in the case of multinomial and naive Bayes model families. Then we proceed by extending these algorithms to more complex, tree-structured Bayesian networks.

Copyright 2007 Petri Kontkanen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

Many problems in bioinformatics can be cast as model class selection tasks, that is, as tasks of selecting, among a set of competing mathematical explanations, the one that best describes a given sample of data. Typical examples of this
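
As an illustration of the computational cost described in the abstract, the sketch below computes the NML normalizing sum for a single multinomial variable in two ways. The first enumerates every possible data sample of size n, which takes exponential time. The second uses a recurrence over the number of values K that is known from the literature on multinomial NML; whether it matches the exact presentation in this paper is an assumption. The function names and the small parameter ranges are hypothetical and chosen only for this example.

```python
from itertools import product
from math import comb


def max_likelihood(sample, n):
    """Maximized multinomial likelihood of one sample: prod (count/n)^count."""
    counts = {}
    for x in sample:
        counts[x] = counts.get(x, 0) + 1
    p = 1.0
    for h in counts.values():
        p *= (h / n) ** h
    return p


def nml_normalizer_brute(K, n):
    """Sum the maximized likelihood over all K**n samples (exponential time)."""
    return sum(max_likelihood(s, n) for s in product(range(K), repeat=n))


def nml_normalizer_recursive(K, n):
    """Same sum via the recurrence C(k+2, n) = C(k+1, n) + (n/k) * C(k, n)."""
    c_prev = 1.0  # C(1, n): a single value, so every sample has likelihood 1
    if K == 1:
        return c_prev
    # C(2, n) computed directly from the binomial sum (note 0.0 ** 0 == 1.0)
    c_curr = sum(
        comb(n, h) * (h / n) ** h * ((n - h) / n) ** (n - h)
        for h in range(n + 1)
    )
    for k in range(1, K - 1):
        c_prev, c_curr = c_curr, c_curr + (n / k) * c_prev
    return c_curr


if __name__ == "__main__":
    for K in range(1, 5):
        for n in range(1, 6):
            brute = nml_normalizer_brute(K, n)
            fast = nml_normalizer_recursive(K, n)
            print(K, n, brute, fast)
```

The brute-force version is only usable for very small K and n, which is the point of the comparison: the recurrence needs O(n + K) arithmetic operations for the same quantity.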
