Biology report: "ANMM4CBR: a case-based reasoning method for gene expression data classification"

A collection of biology research reports published in the journal Algorithms for Molecular Biology, offering readers knowledge of the field.

Yao and Li, Algorithms for Molecular Biology 2010, 5:14

RESEARCH, Open Access

ANMM4CBR: a case-based reasoning method for gene expression data classification

Bangpeng Yao, Shao Li

Abstract

Background: Accurate classification of microarray data is critical for successful clinical diagnosis and treatment. However, the curse of dimensionality and noise in the data undermine the performance of many algorithms.

Method: To obtain a robust classifier, this article proposes a novel Additive Nonparametric Margin Maximum for Case-Based Reasoning (ANMM4CBR) method. ANMM4CBR employs a case-based reasoning (CBR) method for classification. CBR is a suitable paradigm for microarray analysis, where the rules that define the domain knowledge are difficult to obtain because usually only a small number of training samples are available. Moreover, to select the most informative genes, we propose to perform feature selection by additively optimizing a nonparametric margin maximum criterion, which is defined based on gene pre-selection and sample clustering. Our feature selection method is very robust to noise in the data.

Results: The effectiveness of our method is demonstrated on both simulated and real data sets. We show that the ANMM4CBR method performs better than some state-of-the-art methods, such as support vector machine (SVM) and k nearest neighbor (kNN), especially when the data contain a high level of noise.

Availability: The source code is attached as an additional file of this paper.
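To make the idea in the Method paragraph more concrete, the following Python sketch shows one simple way to select genes additively by maximizing a nonparametric margin and then classify a new sample by retrieving its most similar stored case. It is not the authors' ANMM4CBR implementation: it omits gene pre-selection and sample clustering, and it uses a plain nearest-neighbor margin (distance to the nearest sample of another class minus distance to the nearest sample of the same class) as an assumed stand-in criterion. The function names `nn_margin`, `greedy_select`, and `classify_1nn` are hypothetical.

```python
import numpy as np


def nn_margin(X, y, features):
    """Mean nearest-neighbor margin restricted to the given gene columns.

    Assumed stand-in criterion, not the paper's definition: for each sample,
    (distance to nearest sample of another class) minus
    (distance to nearest sample of the same class).
    Every class needs at least two samples.
    """
    Z = X[:, features]
    diff = Z[:, None, :] - Z[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2))      # pairwise Euclidean distances
    np.fill_diagonal(D, np.inf)               # a sample is not its own neighbor
    same = y[:, None] == y[None, :]
    hit = np.where(same, D, np.inf).min(axis=1)    # nearest same-class sample
    miss = np.where(~same, D, np.inf).min(axis=1)  # nearest other-class sample
    return float(np.mean(miss - hit))


def greedy_select(X, y, n_features):
    """Additively pick the gene that most increases the margin at each step."""
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(n_features):
        best = max(remaining, key=lambda j: nn_margin(X, y, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected


def classify_1nn(X_train, y_train, x, features):
    """Case retrieval: return the label of the most similar stored case."""
    d = np.linalg.norm(X_train[:, features] - x[features], axis=1)
    return y_train[np.argmin(d)]


if __name__ == "__main__":
    # Toy data (made up for illustration): gene 0 separates the two classes,
    # gene 1 is noise.
    X = np.array([[0.0, 5.0], [0.1, 1.0], [1.0, 4.0], [1.1, 0.0]])
    y = np.array([0, 0, 1, 1])
    genes = greedy_select(X, y, 1)
    print("selected genes:", genes)
    print("predicted class:", classify_1nn(X, y, np.array([0.05, 3.0]), genes))
```

Adding one gene at a time keeps the search tractable when there are thousands of candidate genes, at the cost of not guaranteeing the globally best subset.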
Background

Recently, gene microarray technology has become a fundamental tool in biomedical research, enabling us to simultaneously observe the expression of thousands of genes at the transcriptional level. Two typical problems that researchers want to solve using microarray data are (1) discovering informative genes for classification based on different cell types or diseases [1] and (2) clustering and arranging genes.